(57) Abstract

A video coding device capable of adaptively processing input video data according to the properties of the data and realizing effective
above-mentioned object can be realized by the provision of a transferring-order deciding and rearranging portion (104) that can prepare
an integrated component unit by forming combinations of subband-based frequency coefficients of respective components Y, U and V and
can change the number of respective elements of respective components Y, U and V.






WO 98/54903 





PCT/JP98/02349 



DESCRIPTION 



VIDEO CODING DEVICE AND VIDEO DECODING DEVICE 



Technical Field 



The present invention pertains to the field of 
digital video processing and relates to a video coding 
device for efficiently encoding video data and a video 
decoding device for decoding video data coded by the 
video coding device. 



Recently, a subband coding method has been proposed
that can efficiently encode and decode video signals.
This well-known, highly efficient subband coding method
decomposes an input image into frequency bands by means
of a bank of band-decomposing filters. The
band-decomposing filter bank is a one-dimensional filter
bank that serves as a two-dimensional band-decomposing
filter bank when it is applied to the input image
repeatedly in the horizontal and vertical directions.
This method was reported by Fujii and Noumura in
"Topics on Wavelet Transform", Technical Report of
IEICE, IE92-11, 1992.



Background Art

In the prior art, a subband image as shown in
Fig. 1B is obtained by conducting two-dimensional subband
decomposition three times. The first two-dimensional
subband decomposition yields a horizontal high-pass
and vertical low-pass band, a horizontal low-pass and
vertical high-pass band, and a horizontal and vertical
high-pass band, which are designated HL1, LH1 and
HH1 respectively. The horizontal and vertical low-pass
band obtained by the first decomposition is further
subjected to two-dimensional band decomposition, by
which three subbands HL2, LH2 and HH2 are obtained.

The horizontal and vertical low-pass subband obtained
by the second decomposition is subjected to a
third two-dimensional subband decomposition, by which
three subbands HL3, LH3 and HH3 and a horizontal and
vertical low-pass subband LL3 are obtained. A wavelet
transform filter bank or a band-decomposing and
synthesizing filter bank may be used as the
band-decomposing filter bank. The decomposed subband
images thus have a hierarchical (layered) structure from
the low-frequency band to the high-frequency band.
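
The three-stage decomposition described above can be sketched as follows. This is only an illustration: the text does not specify the filters, so a simple Haar averaging/differencing pair is assumed here, and the function names (haar_1d, split_2d, decompose) are hypothetical. The band names follow the convention of the text, in which the first letter refers to the horizontal direction and the second to the vertical direction.

```python
def haar_1d(samples):
    """Split an even-length sequence into (low-pass, high-pass) halves."""
    pairs = [(samples[i], samples[i + 1]) for i in range(0, len(samples), 2)]
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high


def filter_rows(matrix):
    """Apply haar_1d to every row; return (low matrix, high matrix)."""
    split = [haar_1d(row) for row in matrix]
    return [low for low, _ in split], [high for _, high in split]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def split_2d(image):
    """One 2-D step built from the 1-D filter: rows first, then columns."""
    h_low, h_high = filter_rows(image)  # horizontal filtering
    # Vertical filtering is row filtering applied to the transposed matrix.
    ll, lh = (transpose(m) for m in filter_rows(transpose(h_low)))
    hl, hh = (transpose(m) for m in filter_rows(transpose(h_high)))
    return ll, hl, lh, hh


def decompose(image, levels=3):
    """Repeat the 2-D split on the low-pass band; return a dict of subbands."""
    subbands = {}
    low = image
    for level in range(1, levels + 1):
        low, hl, lh, hh = split_2d(low)
        subbands[f"HL{level}"] = hl
        subbands[f"LH{level}"] = lh
        subbands[f"HH{level}"] = hh
    subbands[f"LL{levels}"] = low
    return subbands
```

For an 8 x 8 input and three levels, this yields ten subbands, with LL3 reduced to a single coefficient.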

Progressive image transmission can be easily
realized by utilizing the hierarchical structure of the
subband images. The progressive image transmission
method enables a video decoding device to reproduce a
low-resolution image by using only a part of the coded
data. The more coded data is decoded, the higher
the resolution of the decoded image. Japanese Laid-Open
Patent Publication (TOKKAI HEI) No. 8-242379 describes
a system (hereinafter referred to as the prior art
system) that realizes progressive image transmission.

A video coding device used in the prior art system
comprises a subband decomposing portion for decomposing
an input image into subband images by using
two-dimensional decomposing filters, a coefficient coding
portion for encoding coefficients of the decomposed
subband images, a variable-length coding portion for
performing variable-length coding of the coded
coefficient data from the coefficient coding portion,
and a line-transmitting portion for transmitting the
plurality of components composing the image one line at
a time. The coefficient coding portion encodes the
coefficients by using any one of various
coding methods (e.g., DPCM coding, zero-tree
coding, or scalar-quantizing coding). This process
includes a quantizing step.

The operation of the line-transmitting portion will
be described below in detail, by way of example, for
an input image composed of three components, Y (a
luminance component) and U and V (chrominance components),
which has been subjected to subband decomposition three
times as shown in Fig. 1B. Processing starts from the subband
LL3, which gives the lowest resolution of the image.

In this example, the line-transmitting portion
transmits the components Y, U and V sequentially, line
by line, starting from the first line of the subband
LL3. Having transferred all lines of the subband LL3,
the portion transfers the components Y, U and V of
the subbands LH3, HL3 and HH3 in the following order:
the component Y on the first lines of the subbands LH3,
HL3 and HH3; the component U on the first lines of the
subbands LH3, HL3 and HH3; the component V on the first
lines of the subbands LH3, HL3 and HH3; the component Y
on the second lines of the subbands LH3, HL3 and HH3; U
on the second lines of LH3, HL3 and HH3; V on the
second lines of the subbands LH3, HL3 and HH3; and so
on. Having transmitted all lines of LH3, HL3 and HH3,
the line-transmitting portion transfers, in a similar
way, the lines of LH2, HL2 and HH2 and then the lines of
LH1, HL1 and HH1. The above-mentioned procedure of the
line-transmitting portion is executed according to a
programmed flow.

Orderly transmission of the components Y, U, V 
composing the image per line produces coded data having 
a hierarchical structure. 
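
The line-by-line transfer order of the prior art system, as described above, can be written out as a short sketch. The function name prior_art_line_order and its parameters are hypothetical, and the doubling of the number of lines at each finer level assumes a dyadic decomposition, which the text implies but does not state in these terms.

```python
def prior_art_line_order(levels, lines_in_lowest_band, components=("Y", "U", "V")):
    """Return (component, subband, line) tuples in the prior-art transfer order."""
    order = []

    # Lowest-resolution band first: Y, U, V of line 0, then of line 1, ...
    lowest = f"LL{levels}"
    for line in range(lines_in_lowest_band):
        for component in components:
            order.append((component, lowest, line))

    # Then LH/HL/HH of each level, from the coarsest level to the finest.
    lines = lines_in_lowest_band
    for level in range(levels, 0, -1):
        bands = [f"LH{level}", f"HL{level}", f"HH{level}"]
        for line in range(lines):
            for component in components:
                for band in bands:
                    order.append((component, band, line))
        lines *= 2  # assumption: each finer level has twice as many lines

    return order
```

Whatever the image content, the unit of transfer in this scheme is always one line, which is the limitation discussed later in the text.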

The prior art video decoding device comprises a
line receiving portion for receiving the coded data
from the line-transmitting portion of the
above-mentioned video coding device and rearranging the
data into respective component groups, a variable-length
decoding portion for decoding the rearranged
variable-length-coded data, a decoded-data counting
portion for counting bits of data decoded by the
variable-length decoding portion, a decoding truncating
portion for comparing the number of bits counted by the
decoded-data counting portion with a preset threshold
or an externally given threshold and giving a command for
stopping the decoding operation of the variable-length
decoding portion when the number of decoded bits
exceeds the threshold, a data completing portion for
compensating for the lack of truncated data by adding zeros
when the decoding of the coded data has been truncated at
the specified number of bits, a coefficient decoding
portion for decoding the coded coefficient data by
reversing the processing procedure of the
coefficient coding portion, and a subband synthesizing
portion for synthesizing an image from the subbands
through two-dimensional synthesizing filters.

The video decoding device can thus reproduce an 
entire image from coded data having a hierarchical 
structure or a part thereof. 
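
The interplay of the decoded-data counting, decoding truncating and data completing portions can be illustrated with a minimal sketch. It assumes, purely for illustration, that the coded stream is already split into units that each carry a known bit cost and a list of coefficients; the real device counts bits while performing variable-length decoding. The name decode_with_truncation is hypothetical.

```python
def decode_with_truncation(units, bit_threshold, total_coefficients):
    """Decode units until the bit count exceeds the threshold, then zero-fill.

    units: list of (bit_cost, coefficients) pairs in transmission order.
    Returns (coefficients padded to total_coefficients, bits counted).
    """
    decoded = []
    bits_counted = 0
    for bit_cost, coefficients in units:
        decoded.extend(coefficients)      # variable-length decoding (stubbed)
        bits_counted += bit_cost          # decoded-data counting
        if bits_counted > bit_threshold:  # decoding truncating: stop command
            break
    # Data completing: the coefficients that were never decoded become zero.
    decoded.extend([0] * (total_coefficients - len(decoded)))
    return decoded, bits_counted
```

Because the units arrive in ascending order of resolution, the zero-filled part always corresponds to the higher-frequency coefficients, so the synthesized image is a lower-resolution version of the original.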

The conventional video coding and video decoding
system can realize progressive image transmission by
transmitting image components line by line in ascending
order starting from the lowest-resolution band image.
However, the prior art system encounters several
problems resulting from the fixed
transfer unit of a line. For example, an image composed
of a luminance component Y and chrominance components U
and V may be recognized more easily when only the
component Y is transmitted before the components U and V
than when all components are transmitted as a unit.

In this case, it is preferable to transfer the
image components subband by subband, not line by line.
Furthermore, it has been shown that an image composed of
components R, G and B may be reproduced with better
subjective image quality at the decoding terminal when
the coded coefficients of the respective components R, G
and B are transmitted one by one. This is because these
components have substantially the same influence on the
visual property.

The prior art system presumes that the components of an
image have the same size. Therefore, it cannot be
adapted to an input image composed of components of
different sizes, e.g., in the 4:2:2 or 4:2:0 format.

Furthermore, the prior art system presumes that the
respective components of an image have the same number
of subbands, and it cannot be adapted to an input image
whose components are divided into different numbers of
subbands.



Disclosure of Invention



The present invention is directed to a system for
effective progressive image transmission that solves the
foregoing problems involved in the prior art.
(1) Accordingly, an object of the present invention is
to provide a video coding device, which is provided
with a subband-decomposing means for decomposing an
image composed of N (N ≥ 2) kinds of luminance or
chrominance components into subband images for each of
components A^n (1 ≤ n ≤ N, where n is an integer) composing
an image to be coded, coefficient coding means for
encoding a frequency coefficient of the subband images,
rearranging means for preparing integrated component
units by combining frequency coefficients included in
respective components A^n according to the coded
coefficient data prepared by the coefficient coding
means and rearranging the prepared integrated component
units of the coefficient-coded data in an ascending
order of subband image resolution, starting from the
integrated component unit including the coded
coefficient data of the lowest resolution subband, and
a variable-length coding means for performing
variable-length encoding of the rearranged coefficient-coded
data, wherein the rearranging means prepares each of
the integrated component units by setting therein the
frequency coefficients contained in the respective
components A^n, which are all frequency coefficients
included in m (m ≥ 1) pieces of the respective
components' subbands, when the components A^n have
the same size and the same number of subbands.
(2) Correspondingly, another object of the present
invention is to provide a video decoding device, which
is provided with a variable-length decoding means for
decoding variable-length coded data, a decoded-data
counting means for counting bits of each integrated
component unit decoded by the variable-length decoding
means, a decoding truncating means for comparing the
number of bits counted by the decoded-data counting
means with an externally-given number of bits and
giving a decoding-stop command when the number of
decoded bits exceeds the given number of bits, a
component separating means for separating the decoded
integrated component unit into respective components
A^n, a data completing means for compensating for lack
of truncated data by adding a specified value to each
of the components composing a screenful image, data
arranging means for arranging coded coefficient data
separated by the component separating means into
specified positions for respective components A^n, a
coefficient decoding means for decoding coded-coefficient
data separated and arranged for respective
components A^n by the component separating means, and a
subband synthesizing means for reproducing a decoded
image by combining subbands of data decoded by the
coefficient decoding means for respective components
A^n, wherein the component separating means separates
the integrated component unit as combinations of all
frequency coefficients contained in m (m ≥ 1) subbands
for respective components A^n when the respective
components A^n have the same size and the same number of
subbands.

Each integrated component unit contains all
frequency coefficients in m (m ≥ 1) respective subbands
of the respective components A^n. Therefore, specified
subbands of image components such as the luminance
signal Y and the chrominance signals U and V, which may have
different levels of influence on the human visual property,
can be transmitted first, enabling one to recognize a
summary of the image at an earlier stage of decoding at
the decoding side. When an image to be coded is known to be
of higher resolution in a specified direction, the
coding device can first transmit the coded coefficients of
the subbands for the higher-resolution direction, and the
decoding device can decode those coded coefficients, terminate
the decoding midway through the coded data
and reproduce the image from only the data decoded up to
that time, improving the subjective quality of the
image.
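
As a rough illustration of aspects (1) and (2), the sketch below builds integrated component units in which every component contributes all coefficients of a group of subbands, and orders the units from the lowest resolution upward. Only the band names are tracked, not the coefficients themselves. The particular grouping (LL alone, then LH, HL and HH of each level) and the names resolution_groups and subband_units are assumptions made for this example; the text only requires m (m ≥ 1) subbands per component in each unit.

```python
def resolution_groups(levels):
    """Hypothetical grouping: [LLn], then [LHk, HLk, HHk] for k = n .. 1."""
    groups = [[f"LL{levels}"]]
    for level in range(levels, 0, -1):
        groups.append([f"LH{level}", f"HL{level}", f"HH{level}"])
    return groups


def subband_units(groups, components=("Y", "U", "V")):
    """One unit per group; within a unit, all of Y precedes U, then V."""
    return [
        [(component, band) for component in components for band in group]
        for group in groups
    ]


units = subband_units(resolution_groups(3))
# units[0] holds LL3 of Y, U and V; units[1] starts with Y's LH3, HL3, HH3.
```

Within each unit the luminance subbands come before the chrominance subbands, so a decoder that stops partway through a unit has still received the data with the greater visual influence.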

(3) Another object of the present invention is to
provide a video coding device, which is provided with a
subband-decomposing means for decomposing an image
composed of N (N ≥ 2) kinds of luminance or
chrominance components into subband images for each of
components A^n (1 ≤ n ≤ N, where n is an integer) composing
an image to be coded, coefficient coding means for
encoding a frequency coefficient of the subband images,
rearranging means for preparing integrated component
units by combining frequency coefficients included in
respective components A^n according to the coded
coefficient data prepared by the coefficient coding
means and rearranging the prepared integrated component
units of the coefficient-coded data in an ascending
order of subband image resolution, starting from the
integrated component unit including the coded
coefficient data of the lowest resolution subband, and
a variable-length coding means for performing
variable-length encoding of the rearranged coefficient-coded
data, wherein the rearranging means prepares each of
the integrated component units by setting therein the
frequency coefficients included in the respective
components A^n as m (m ≥ 1) pieces of frequency
coefficients contained at the same relative positions
in m (m ≥ 1) pieces of the respective components'
subbands when the components A^n have the same size and
the same number of subbands.

(4) Correspondingly, another object of the present
invention is to provide a video decoding device, which
is provided with a variable-length decoding means for
decoding variable-length coded data, a decoded-data
counting means for counting bits of each integrated
component unit decoded by the variable-length decoding
means, a decoding truncating means for comparing the
number of bits counted by the decoded-data counting
means with an externally-given number of bits and
giving a decoding-stop command when the number of
decoded bits exceeds the given number of bits, a
component separating means for separating the decoded
integrated component unit into respective components
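A^n, a data completing means for compensating for lack
of truncated data by adding a specified value to each
of the components composing a screenful image, data
arranging means for arranging coded coefficient data
separated by the component separating means into
specified positions for respective components A^n, a
coefficient decoding means for decoding coded-coefficient
data separated and arranged for respective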
components A^n by the component separating means, and a
subband synthesizing means for reproducing a decoded
image by combining subbands of data decoded by the
coefficient decoding means for respective components
A^n, wherein the component separating means separates
the integrated component unit into combinations of m
(m ≥ 1) pieces of frequency coefficients having the same
relative positions in respective m (m ≥ 1) subbands of
the respective components A^n when the components A^n
have the same size and the same number of subbands.

Therefore, the devices operate with integrated
component units whose elements are m (m ≥ 1) pieces of
frequency coefficients having the same relative
positions in m (m ≥ 1) respective subbands of the respective
components A^n. The decoding device can decode those coded
coefficients, terminate the decoding midway through
the coded data and reproduce the image from only the
data decoded up to that time, improving the subjective
quality of the image when the image is composed of
components R, G and B, which have substantially
the same influence on the human visual property.
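
A minimal sketch of the coefficient-level interleaving of aspects (3) and (4), for the simplest case m = 1 and one subband per component, is given below. The function name interleave_coefficients and the raster scanning order are assumptions for illustration; the text does not fix the scanning order at this point.

```python
def interleave_coefficients(component_bands):
    """Build one unit per position from same-shaped subbands.

    component_bands: dict mapping a component name to a 2-D list holding
    the same subband of that component (all components have equal size).
    Each returned unit holds the coefficients found at the same relative
    position in every component.
    """
    names = list(component_bands)
    first = component_bands[names[0]]
    units = []
    for row in range(len(first)):
        for col in range(len(first[0])):
            units.append([(name, component_bands[name][row][col]) for name in names])
    return units


bands = {
    "R": [[1, 2], [3, 4]],
    "G": [[5, 6], [7, 8]],
    "B": [[9, 10], [11, 12]],
}
units = interleave_coefficients(bands)
# units[0] == [("R", 1), ("G", 5), ("B", 9)]
```

With this ordering, truncating the stream at any point leaves R, G and B decoded to nearly the same extent, which suits components of similar visual influence.
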
(5) Another object of the present invention is to
provide a video coding device, which is provided with a
subband-decomposing means for decomposing an image
composed of N (N ≥ 2) kinds of luminance or
chrominance components into subband images for each of
components A^n (1 ≤ n ≤ N, where n is an integer) composing
an image to be coded, coefficient coding means for
encoding a frequency coefficient of the subband images,
rearranging means for preparing integrated component
units by combining frequency coefficients included in
respective components A^n according to the coded
coefficient data prepared by the coefficient coding
means and rearranging the prepared integrated component
units of the coefficient-coded data in an ascending
order of subband image resolution, starting from the
integrated component unit including the coded
coefficient data of the lowest resolution subband, and
a variable-length coding means for performing
variable-length encoding of the rearranged coefficient-coded
data, wherein the rearranging means prepares each of
the integrated component units by setting therein
different numbers of frequency coefficients of the
respective components A^n according to each component
size when the components A^n are different in size and
have the same number of subbands.

(6) Correspondingly, another object of the present
invention is to provide a video decoding device, which
is provided with a variable-length decoding means for
decoding variable-length coded data, a decoded-data
counting means for counting bits of each integrated
component unit decoded by the variable-length decoding
means, a decoding truncating means for comparing the
number of bits counted by the decoded-data counting
means with an externally-given number of bits and
giving a decoding-stop command when the number of
decoded bits exceeds the given number of bits, a
component separating means for separating the decoded
integrated component unit into respective components
A^n, a data completing means for compensating for lack
of truncated data by adding a specified value to each
of the components composing a screenful image, data
arranging means for arranging coded coefficient data
separated by the component separating means into
specified positions for respective components A^n, a
coefficient decoding means for decoding coded-coefficient
data separated and arranged for respective
components A^n by the component separating means, and a
subband synthesizing means for reproducing a decoded
image by combining subbands of data decoded by the
coefficient decoding means for respective components
A^n, wherein the component separating means separates
the integrated component unit as combinations of
different numbers of frequency coefficients according to
respective component sizes when the respective
components A^n are different in size and have the same
number of subbands.

The devices can be adapted to process an image
whose luminance and chrominance components
differ from each other in resolution, which is a
great advantage over the conventional method, which can
be applied only to an image whose components have the same
resolution. This feature of the present
invention is particularly desirable for digital image
processing, since many digital images are
formatted so that the luminance component has a higher
resolution than the chrominance components.
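
For components of different sizes, a unit can carry a different number of coefficients per component. The sketch below assumes a 4:2:0 image, in which the Y subband is twice as wide and twice as high as the matching U and V subbands, so that each unit holds a 2 x 2 block of Y coefficients together with one U and one V coefficient. The 2 x 2 grouping and the name units_420 are assumptions for illustration; the text only states that the number of coefficients depends on the component size.

```python
def units_420(y_band, u_band, v_band):
    """Pair each U/V coefficient with the co-located 2 x 2 block of Y."""
    units = []
    for row in range(len(u_band)):
        for col in range(len(u_band[0])):
            y_block = [
                y_band[2 * row + dr][2 * col + dc]
                for dr in (0, 1)
                for dc in (0, 1)
            ]
            units.append({"Y": y_block, "U": [u_band[row][col]], "V": [v_band[row][col]]})
    return units


y = [[1, 2, 3, 4], [5, 6, 7, 8]]
u = [[9, 10]]
v = [[11, 12]]
units = units_420(y, u, v)
# units[0] == {"Y": [1, 2, 5, 6], "U": [9], "V": [11]}
```

Each unit then reflects the 4:1:1 ratio of Y, U and V samples in the input image.
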
(7) Another object of the present invention is to
provide a video coding device, which is provided with a
subband-decomposing means for decomposing an image
composed of N (N ≥ 2) kinds of luminance or
chrominance components into subband images for each
of components A^n (1 ≤ n ≤ N, where n is an integer)
composing an image to be coded, coefficient coding
means for encoding a frequency coefficient of the
subband images, rearranging means for preparing
integrated component units by combining the subbands
included in respective components A^n according to the
coded coefficient data prepared by the coefficient
coding means and rearranging the prepared integrated
component units of the coefficient-coded data in an
ascending order of subband image resolution, starting
from the integrated component unit including the coded
coefficient data of the lowest resolution subband, and
a variable-length coding means for performing
variable-length encoding of the rearranged coefficient-coded
data, wherein the rearranging means prepares each of
the integrated component units by combining the same
number of low-resolution subbands and different
numbers of high-resolution subbands of the respective
components A^n when the components A^n are different in
size and different in the number of subbands.
(8) Correspondingly, another object of the present
invention is to provide a video decoding device, which
is provided with a variable-length decoding means for
decoding variable-length coded data, a decoded-data
counting means for counting bits of each integrated
component unit decoded by the variable-length decoding
means, a decoding truncating means for comparing the
number of bits counted by the decoded-data counting
means with an externally-given number of bits and
giving a decoding-stop command when the number of
decoded bits exceeds the given number of bits, a
component separating means for separating the decoded
integrated component unit into respective components
A^n, a data completing means for compensating for lack
of truncated data by adding a specified value to each
of the components composing a screenful image, data
arranging means for arranging coded coefficient data
separated by the component separating means into
specified positions for respective components A^n, a
coefficient decoding means for decoding coded-coefficient
data separated and arranged for respective
components A^n by the component separating means, and a
subband synthesizing means for reproducing a decoded
image by combining subbands of data decoded by the
coefficient decoding means for respective components
A^n, wherein the component separating means separates
the integrated component unit as combinations of the
same number of low-resolution subbands and
different numbers of high-resolution subbands of
respective components A^n when the respective components
A^n are different in size and different in the number of
subbands.

The devices can be adapted to process an image
whose luminance and chrominance components
differ from each other in resolution and have
different subband-decomposition levels, which is a great
advantage over the conventional method, which can be
applied only to an image whose components have the same
resolution and the same number of subbands. This
feature of the present invention is particularly desirable
for digital image processing, since many
digital images are formatted so that the
luminance component has a higher resolution than
the chrominance components, and it is common to vary the
subband-decomposition level according to the resolution
of the component.
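
One possible reading of aspects (7) and (8) is sketched below: the subband groups of every component are aligned from the lowest resolution upward, so the low-resolution units contain the same number of subbands per component, while the highest-resolution units contain subbands only of the component that was decomposed more times. The grouping, the function names band_groups and units_aligned_low, and the example of Y decomposed three times and U, V twice are all assumptions for illustration, not details taken from the text.

```python
def band_groups(levels):
    """[LLn], then [LHk, HLk, HHk] for k = n .. 1 (coarse to fine)."""
    groups = [[f"LL{levels}"]]
    for level in range(levels, 0, -1):
        groups.append([f"LH{level}", f"HL{level}", f"HH{level}"])
    return groups


def units_aligned_low(levels_per_component):
    """Align components at the lowest-resolution end of their band lists."""
    groups = {c: band_groups(n) for c, n in levels_per_component.items()}
    unit_count = max(len(g) for g in groups.values())
    units = []
    for index in range(unit_count):
        # A component with fewer levels contributes nothing to the last units.
        units.append({c: (g[index] if index < len(g) else []) for c, g in groups.items()})
    return units


units = units_aligned_low({"Y": 3, "U": 2, "V": 2})
# units[0] == {"Y": ["LL3"], "U": ["LL2"], "V": ["LL2"]}
# units[3] == {"Y": ["LH1", "HL1", "HH1"], "U": [], "V": []}
```

In a 4:2:0 image with these decomposition levels, the subbands paired in each of the first three units have the same dimensions.
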
(9) Another object of the present invention is to
provide a video coding device, which is provided with a
subband-decomposing means for decomposing an image
composed of N (N ≥ 2) kinds of luminance or
chrominance components into subband images for each
of components A^n (1 ≤ n ≤ N, where n is an integer)
composing an image to be coded, coefficient coding
means for encoding a frequency coefficient of the
subband images, rearranging means for preparing
integrated component units by combining the subbands
included in respective components A^n according to the
coded coefficient data prepared by the coefficient
coding means and rearranging the prepared integrated
component units of the coefficient-coded data in an
ascending order of subband image resolution, starting
from the integrated component unit including the coded
coefficient data of the lowest resolution subband, and
a variable-length coding means for performing
variable-length encoding of the rearranged coefficient-coded
data, wherein the rearranging means prepares each of
the integrated component units by combining the same
number of high-frequency subbands and different
numbers of low-frequency subbands of the respective
components A^n when the components A^n are different in
size and different in the number of subbands.

(10) Correspondingly, another object of the present
invention is to provide a video decoding device, which
is provided with a variable-length decoding means for
decoding variable-length coded data, a decoded-data
counting means for counting bits of each integrated
component unit decoded by the variable-length decoding
means, a decoding truncating means for comparing the
number of bits counted by the decoded-data counting
means with an externally-given number of bits and
giving a decoding-stop command when the number of
decoded bits exceeds the given number of bits, a
component separating means for separating the decoded
integrated component unit into respective components
A^n, a data completing means for compensating for lack
of truncated data by adding a specified value to each
of the components composing a screenful image, data
arranging means for arranging coded coefficient data
separated by the component separating means into
specified positions for respective components A^n, a
coefficient decoding means for decoding coded-coefficient
data separated and arranged for respective
components A^n by the component separating means, and a
subband synthesizing means for reproducing a decoded
image by combining subbands of data decoded by the
coefficient decoding means for respective components
A^n, wherein the component separating means separates
the integrated component unit as combinations of the
same number of high-resolution subbands and
different numbers of low-resolution subbands of
respective components A^n when the components A^n are
different in size and different in the number of
subbands.

The devices can be adapted to process an image
whose luminance and chrominance components have
different resolution levels and different
subband-decomposition levels, which is a great advantage over
the conventional method, which can be applied only to an image
whose components have the same resolution and the same
number of subbands. Furthermore, these aspects of the
present invention provide the feature that each
integrated component unit always reflects the ratio of
the numbers of the respective components contained in the
input image. This feature eliminates the need for decoding
redundant data at the decoding side when decoding an
amount of data according to the resolution of the
display unit.
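
One possible reading of aspects (9) and (10) is sketched below: the components are aligned at the high-resolution end, and the extra low-resolution subbands of the component that was decomposed more times are gathered into the first unit. Under the assumptions of this example (square 4:2:0 components, Y decomposed three times, U and V twice; all function names hypothetical), every unit then contains four times as many Y coefficients as U coefficients, matching the component ratio of the input image.

```python
def band_groups(levels):
    """[LLn], then [LHk, HLk, HHk] for k = n .. 1 (coarse to fine)."""
    groups = [[f"LL{levels}"]]
    for level in range(levels, 0, -1):
        groups.append([f"LH{level}", f"HL{level}", f"HH{level}"])
    return groups


def units_aligned_high(levels_per_component):
    """Align components at the highest-resolution end of their band lists."""
    groups = {c: band_groups(n) for c, n in levels_per_component.items()}
    unit_count = min(len(g) for g in groups.values())
    units = []
    for index in range(unit_count):
        unit = {}
        for component, g in groups.items():
            extra = len(g) - unit_count  # surplus low-resolution groups
            if index == 0:
                unit[component] = [band for group in g[: extra + 1] for band in group]
            else:
                unit[component] = g[extra + index]
        units.append(unit)
    return units


def coefficient_count(width, bands):
    """Coefficients in the listed bands of a square component of given width."""
    return sum((width >> int(band[2:])) ** 2 for band in bands)


units = units_aligned_high({"Y": 3, "U": 2, "V": 2})
# units[0] == {"Y": ["LL3", "LH3", "HL3", "HH3"], "U": ["LL2"], "V": ["LL2"]}
```

For a 16 x 16 Y component and 8 x 8 U and V components, the first unit holds 16 Y coefficients and 4 each of U and V, so a decoder targeting a small display does not need to decode surplus data of any one component.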

(11) Another object of the present invention is to
provide a video coding device, which is based on the
above-mentioned device of (9) and further characterized
in that the rearranging means prepares each of the
integrated component units by combining the
lowest-resolution subbands of the respective components A^n and
different numbers of all other low-resolution-level
subbands of the respective components A^n when the
respective components A^n are different in size and
different in the number of subbands.

(12) Correspondingly, the present invention also
provides a video decoding device, which is based on the
above-mentioned device of (10) and further
characterized in that the component separating means
separates the integrated component unit into
combinations of subbands for respective components,
each combination composed of one lowest-resolution
subband and different numbers of all other
low-resolution subbands.

The devices can first separate and transmit the
lowest-resolution subbands of the respective components A^n to
give a summary of the content of an image at the outset,
making it possible to improve the subjective quality of the
reproduced image.

Brief Description of the Drawings

Fig. 1A is a view for explaining a subband coding
method.

Fig. 1B is a view for explaining a subband coding
method.

Fig. 2 is a view for explaining a progressive video
transmission system.

Fig. 3 is a block-diagram of a prior art video
coding device.

Fig. 4 is a block-diagram of a prior art video 
decoding device. 

Fig. 5 depicts a sequence of transferring subband
image coefficients according to the prior art video
coding device.

Fig. 6 is a flow chart for explaining the operation 
of a prior art video coding device. 

Fig. 7 is a block-diagram of a video coding device 
which is a first embodiment of the present invention. 

Fig. 8 is a block-diagram of a video decoding 
device which is a first embodiment of the present 
invent ion . 

Fig. 9 depicts an example of decomposing image 
components into subbands in the first embodiment of the 
present invention.

Fig. 10 depicts an example of a sequence of 
transferring coefficients of subband images in the 
first embodiment of the present invention. 

Fig. 11 depicts another example of a sequence of 
transferring coefficients of subband images in the 
first embodiment of the present invention. 

Fig. 12 depicts an exemplified sequence of scanning 
coefficients in respective subbands in the first 
embodiment of the present invention. 

Fig. 13 depicts an example of collecting a
plurality of subbands into elements of an integrated
component unit in the first embodiment of the present
invention.

Fig. 14 depicts an example of using coefficients of 
each line in each subband as an element of an 
integrated component unit in the first embodiment of 
the present invention. 

Fig. 15 is a flow chart depicting a procedure of 
operations of a video coding device which is the first 
embodiment of the present invention. 

Fig. 16 is a flow chart depicting a procedure of
operations of a video decoding device which is the
first embodiment of the present invention.

Fig. 17 depicts an example of decomposing image 
components into subbands in the second embodiment of 
the present invention. 

Fig. 18 depicts a relationship between coefficients 
of respective components in the second embodiment of 
the present invention. 

Fig. 19 depicts an example of an integrated 
component unit used in the second embodiment of the 
present invention. 

Fig. 20 depicts another example of an integrated 
component unit used in the second embodiment of the 
present invention.

Fig. 21 depicts an example of subband decomposition
and an integrated component unit used in the third 
embodiment of the present invention. 

Fig. 22 depicts another example of subband 
decomposition and an integrated component unit used in 
the third embodiment of the present invention. 



Best Mode for Carrying Out the Invention 



Prior to explaining preferred embodiments of the 
present invention, prior art video coding device and 
video decoding device will be described below as 
references for the present invention. 

Recently, there has been proposed a subband coding
method that can efficiently encode and decode video
signals. This well-known, highly efficient subband
coding method decomposes an input image into frequency
bands as shown in Fig. 1B by a bank of band-decomposing
filters as shown in Fig. 1A. The band-decomposing
filter-bank shown in Fig. 1A is a one-dimensional
filter-bank that can serve as a two-dimensional band-
decomposing filter-bank by processing the input image
repeatedly in the horizontal and vertical directions.
This method was reported by Fujii and Noumura in
"Topics on Wavelet Transform", TECHNICAL REPORT of
IEICE, IE92-11, 1992.

Fig. 1B shows a subband image obtained by conducting
two-dimensional subband decomposition three times. The
first two-dimensional subband decomposition obtains a
horizontal high-pass and vertical low-pass band, a
horizontal low-pass and vertical high-pass band and a
horizontal and vertical high-pass band, which are
designated by HL1, LH1 and HH1 respectively. A
horizontal and vertical low-pass band obtained by the
first decomposition is further subjected to two-
dimensional band-decomposition by which three subbands
HL2, LH2 and HH2 are obtained.

A horizontal and vertical low-pass subband obtained
by the second decomposition is further subjected to a
third two-dimensional subband decomposition by which
three subbands HL3, LH3 and HH3 and a horizontal and
vertical low-pass subband LL3 are obtained. A wavelet-
transform filter-bank or a band-decomposing and
synthesizing filter-bank may be used as the band-
decomposing filter-bank. The decomposed subband images
thus have a hierarchical (layered) structure from the
low-frequency band to the high-frequency band.
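
The layered structure described above can be sketched in a few lines of Python. This is only an illustration: the function name subband_layout is hypothetical, and the halving of width and height at each decomposition is an assumption (the usual dyadic, critically sampled case) that the text does not state explicitly. The order HL, LH, HH within a level is likewise an arbitrary choice.

```python
def subband_layout(width, height, levels=3):
    """Return (name, width, height) per subband, lowest resolution first.

    Assumes each two-dimensional decomposition halves both dimensions
    and that width and height are divisible by 2**levels.
    """
    w, h = width, height
    per_level = []
    for level in range(1, levels + 1):
        w, h = w // 2, h // 2  # size of the four bands produced at this level
        per_level.append([(f"HL{level}", w, h),
                          (f"LH{level}", w, h),
                          (f"HH{level}", w, h)])
    # The remaining low-pass band of the last decomposition comes first.
    layout = [(f"LL{levels}", w, h)]
    for bands in reversed(per_level):
        layout.extend(bands)
    return layout


if __name__ == "__main__":
    for name, w, h in subband_layout(64, 64):
        print(name, w, h)
```

Three decompositions give ten subbands, which matches the index range i = 1 to 10 used for the first embodiment later in the text.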

Progressive image transmission can be easily
realized by utilizing the hierarchical structure of the
subband images. As shown in Fig. 2, the progressive
image transmission method enables a video decoding
device to reproduce a low-resolution image by using
only a part of the coded data. The more coded data is
decoded, the higher the resolution of the decoded
image. Japanese Laid-Open Patent Publication (TOKKAI
HEI) No. 8-242379 describes a system (hereinafter
referred to as the prior art system) that realizes
progressive image transmission, the structure of which
is shown in Figs. 3 and 4.

Fig. 3 shows a video coding device used in the
prior art system and Fig. 4 shows a video decoding
device used in the system. The video coding device
shown in Fig. 3 comprises a subband decomposing portion
2001 for decomposing an input image into subband images
by using two-dimensional decomposing filters, a
coefficient coding portion 2002 for encoding
coefficients of the decomposed subband images, a
variable-length coding portion 2003 for performing
variable-length coding of the coded coefficient data
from the coefficient coding portion 2002 and a line-
transmitting portion 2004 for transmitting a plurality
of components composing the image one line at a time.
The coefficient coding portion 2002 encodes the
coefficients by using any one of various kinds of
coding methods (e.g., DPCM coding, zero-tree coding,
and scalar-quantizing coding). This process includes a
quantizing step.

The operation of the line-transmitting portion 2004
will be described below in detail, by way of example,
with an input image composed of three components Y (a
luminance component) and U, V (chrominance components)
that has been subjected to subband decomposition three
times as shown in Fig. 1B. Processing starts from the
subband LL3 shown in Fig. 1B, which gives the lowest
resolution of the image.

As shown in Fig. 5, the line-transmitting portion
2004 transmits the components Y, U and V sequentially
line by line, starting from the first line of the
subband LL3. Having transferred all lines of the
subband LL3, the portion transfers the components Y, U
and V in the subbands LH3, HL3 and HH3 in the following
order: the component Y on the first lines of the
subbands LH3, HL3 and HH3; the component U on the first
lines of the subbands LH3, HL3 and HH3; the component V
on the first lines of the subbands LH3, HL3 and HH3;
the component Y on the second lines of the subbands
LH3, HL3 and HH3; U on the second lines of LH3, HL3 and
HH3; V on the second lines of the subbands LH3, HL3 and
HH3; and so on. Having transmitted all lines of LH3, HL3
and HH3, the line-transmitting portion transfers, in a
similar way, the lines of LH2, HL2, HH2 and then the
lines of LH1, HL1, HH1. The above-mentioned procedure
of the line-transmitting portion 2004 is illustrated by
the flowchart of Fig. 6.

Orderly transmission of the components Y, U, V
composing the image per line produces coded data having 
a hierarchical structure. 
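
The prior-art line-by-line order described above can be sketched as follows. The function name prior_art_line_order and the lines_per_level argument (a mapping from decomposition level to the number of lines in each subband of that level) are hypothetical names introduced for illustration; Fig. 6 itself is not reproduced here.

```python
def prior_art_line_order(lines_per_level, levels=3, components=("Y", "U", "V")):
    """List (subbands, line, component) in the prior-art transmitting order.

    lines_per_level maps a level (1..levels) to the number of lines in
    each subband of that level; the mapping is an illustrative assumption.
    """
    order = []
    # All lines of the lowest-resolution subband are sent first.
    for line in range(lines_per_level[levels]):
        for comp in components:
            order.append(((f"LL{levels}",), line, comp))
    # Then the detail subbands, from level 3 down to level 1.
    for level in range(levels, 0, -1):
        group = (f"LH{level}", f"HL{level}", f"HH{level}")
        for line in range(lines_per_level[level]):
            for comp in components:  # Y, then U, then V for the same line
                order.append((group, line, comp))
    return order


if __name__ == "__main__":
    for entry in prior_art_line_order({3: 2, 2: 4, 1: 8})[:12]:
        print(entry)
```

The transfer unit here is always one line of one component, which is the fixed unit the following paragraphs identify as the limitation of the prior art.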

Referring to Fig. 4, the video decoding device
comprises a line receiving portion 2104 for receiving
the coded data from the line-transmitting portion 2004
of the video-coding device of Fig. 3 and rearranging
the data into respective component groups, a variable-
length decoding portion 2101 for decoding the
rearranged variable-length-coded data, a decoded-data
counting portion 2102 for counting bits of data decoded
by the variable-length decoding portion, a decoding
truncating portion 2103 for comparing the number of
bits counted by the decoded-data counting portion with
a preset threshold or an externally given threshold to
give a command for stopping the decoding operation of
the variable-length decoding portion 2101 when the
number of decoded bits exceeds the threshold, a data
completing portion 2105 for compensating for the data
lost by truncation by adding zeros when decoding of
the coded data has been truncated at the specified
number of bits, a coefficient decoding portion 2106 for
decoding coded coefficient data by reversing the
processing procedure of the coefficient coding portion
2002 of Fig. 3 and a subband synthesizing portion 2107
for synthesizing an image from the subbands through
two-dimensional synthesizing filters.

The video decoding device can thus reproduce an 
entire image from coded data having a hierarchical 
structure or a part thereof. 
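
The interplay of the decoded-data counting, decoding truncating and data completing portions can be sketched roughly as below. The representation of the coded data as (value, bit length) pairs and the rule that a codeword is kept only if it fits entirely within the bit budget are assumptions made for illustration; the text only states that decoding stops once the counted bits exceed the threshold and that zeros are added for the missing data.

```python
def decode_with_truncation(codewords, total_coeffs, bit_budget):
    """Decode codewords until the bit budget is used up, then zero-fill.

    codewords: iterable of (value, nbits) pairs, already variable-length
    decoded for the purpose of this sketch (a simplifying assumption).
    """
    decoded = []
    used_bits = 0
    for value, nbits in codewords:
        if used_bits + nbits > bit_budget:
            break  # truncation: stop decoding at the threshold
        decoded.append(value)
        used_bits += nbits
    # Data completion: missing coefficients are replaced by zeros.
    decoded.extend([0] * (total_coeffs - len(decoded)))
    return decoded


if __name__ == "__main__":
    print(decode_with_truncation([(5, 4), (3, 4), (7, 4)], 5, 8))
```

Because the coded data is ordered from the lowest-resolution subband upward, the zero-filled tail corresponds to the higher-resolution coefficients.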

The conventional video-coding and video-decoding 
system can realize progressive image transmitting by 
transmitting image components per line in an ascending 
order starting from the lowest-resolution band-image. 
However, the prior art system encounters several
problems resulting from the fixed transfer unit of a
line. For example, an image composed of a luminance
component Y and chrominance components U and V may be
recognized more easily by transmitting only the
component Y before the components U and V rather than
transmitting all components as a unit.

In this case, it is preferable to transfer the
image components subband by subband, not line by line.
Furthermore, it has been found that an image composed
of components R, G, B may be reproduced with better
subjective image quality at the decoding terminal when
coded coefficients of the respective components R, G
and B are transmitted one by one. This is because these
components have substantially the same influence on
visual perception.

The prior art system presumes that the components
of an image have the same size. Therefore, it cannot be
adapted to an input image composed of components of
different sizes in a format such as 4:2:2 or 4:2:0.

Furthermore, the prior art system presumes that the
respective components of an image have the same number
of subbands and cannot be adapted to an input image
whose components are decomposed into different numbers
of subbands.

Referring now to accompanying drawings, a video- 
coding device and a video decoding device according to 
the present invention will be described below in
detail.

Fig. 7 is a block diagram showing a video coding 
device which is a first embodiment of the present 
invention. As shown in Fig. 7, the first embodiment of 
the present invention includes a subband decomposing 
portion (subband decomposing means) 101, a coefficient 
coding portion (coefficient coding means) 102 and a 
variable-length coding portion (variable-length coding 
means) 103, which are similar in construction to 
portions 2001, 2002 and 2003, respectively, of Fig. 3. 

In Fig. 7, numeral 104 designates a transfer-order
deciding and rearranging portion (rearranging means)
that decides an integrated component unit prepared by
combining luminance or chrominance elements from coded
coefficient data provided from the coefficient coding
portion 102 and arranges coded coefficient data of
subbands in a transmitting order starting from the
coded coefficient data of the lowest-resolution
subband. Whereas the prior art video-coding device
rearranges coded coefficient data in the transmitting
order after variable-length coding of the data, the
video-coding device according to the present invention
rearranges the coded coefficient data in the
transmitting order before variable-length coding of the
data.

This makes it possible to conduct variable-length 
coding of the coded coefficient data by, e.g., an 
arithmetic coding method besides the Huffman coding 
method. According to the present invention, it is also 
possible to conduct rearrangement of the coded 
coefficient data after variable-length coding as the 
prior art device does. The operation of the first 
embodiment is described below with an input image 
composed of three components Y (luminance), U 
(chrominance) and V (chrominance), which is the same as 
that used in the prior art device. In this 
embodiment, these components have the same resolution, 
i.e., the same image sizes. 

An integrated component unit may be prepared from
coefficient-coded data by combining elements of Y, U
and V. The following example is an integrated component
unit that is prepared from subbands of Y, U and V.

Fig. 9 shows coefficients of subband images
obtained by performing subband decomposition three
times on the respective image components Y, U and V
(parts (a), (b) and (c) of Fig. 9, respectively). The
image sizes of the components Y, U and V are equal to
each other. Figs. 10 and 11 show the order of
transmitting the subband image coefficients of Y, U and
V of Fig. 9 (parts (a), (b) and (c) of both Fig. 10 and
Fig. 11, respectively). The characters Y, U and V with
numeral suffixes in Figs. 10 and 11 denote the order
of transmitting subbands of the respective components.

In the first embodiment, an integrated component
unit is composed of subbands and contains a set of the
same-resolution subbands of respective components, that
is: (Yi, Ui, Vi), where i = 1 to 10 (the order of
subbands to be transferred).

The transmitting order is as follows: (Y1, U1, V1),
(Y2, U2, V2), ..., (Y10, U10, V10).

In the case of Fig. 10, a coefficient of high
resolution in the horizontal direction is transferred
before a coefficient of high resolution in the vertical
direction, while in the case of Fig. 11, a coefficient
of high resolution in the vertical direction is
transferred before a coefficient of high resolution in
the horizontal direction. Accordingly, the resolution
of the image reproduced at the decoding side can be
improved first in the horizontal direction in the case
of Fig. 10 and first in the vertical direction in the
case of Fig. 11.
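
The subband-based transmitting order of the first embodiment can be sketched as below. The function name unit_order and the "component:subband" labels are hypothetical. The mapping of the indices i = 1 to 10 onto subband names is an assumption based on the naming given for Fig. 1B (HL is the horizontal high-pass band), since Figs. 10 and 11 are not reproduced here.

```python
def unit_order(levels=3, horizontal_first=True, components=("Y", "U", "V")):
    """Return integrated component units (Yi, Ui, Vi) in transmitting order.

    horizontal_first=True sends HL before LH at each level (assumed to
    correspond to Fig. 10); False sends LH first (assumed Fig. 11).
    """
    detail = ("HL", "LH", "HH") if horizontal_first else ("LH", "HL", "HH")
    subbands = [f"LL{levels}"]          # lowest-resolution subband first
    for level in range(levels, 0, -1):  # then level 3, 2, 1 detail subbands
        subbands.extend(f"{name}{level}" for name in detail)
    # One unit groups the same subband of every component.
    return [tuple(f"{comp}:{sb}" for comp in components) for sb in subbands]


if __name__ == "__main__":
    for i, unit in enumerate(unit_order(), start=1):
        print(i, unit)
```

Switching horizontal_first only swaps the HL and LH positions within each level, which is the difference the text attributes to Figs. 10 and 11.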

Processing of coefficients within a subband may be 
performed in any of the orders shown in Fig. 12. 

The subband is scanned horizontally from the upper
left to the lower right (part (1) of Fig. 12), scanned
vertically from the upper left to the lower right (part
(2) of Fig. 12) or scanned spirally from the center of
the subband to the outside thereof (part (3) of Fig.
12). In Fig. 12, coefficients in a subband are
processed by one scan as shown in part (a) of each of
parts (1) to (3), and by two scans as shown in parts
(b) and (c) of each of parts (1) to (3).

In part (c) of Fig. 12, each arrow represents a
coefficient or a set of plural (t) coefficients. A set
of (t) coefficients is transferred in the direction
indicated by the arrow, the subsequent set of (t)
coefficients is not transferred and the set of (t)
coefficients after that is transferred in the direction
indicated by the arrow. These steps are repeated in the
first scanning process (shown as arrows with solid
lines). The coefficients left untransferred in the
first scanning process are transferred in the second
scanning process (shown as arrows with broken lines) in
a similar way to the first scanning process.

As shown in parts (4) and (5) of Fig. 12, it is
also possible to process coefficients of a subband by
scanning three or more times. For example, let the
horizontal interval and the vertical interval between
coefficients processed in one scan be dx and dy
respectively. If dx=dy=1, all coefficients are encoded
by one raster scan. If dx=dy=2 and the vertical and
horizontal positions of the coefficients to be coded
are written as (y, x):

the first scan encodes coefficients at the positions
(0, 0), (0, 2), (0, 4) ... (2, 0), (2, 2), (2, 4) ...
(4, 0), (4, 2), (4, 4) ... ;

the second scan encodes coefficients at the positions
(0, 1), (0, 3), (0, 5) ... (2, 1), (2, 3), (2, 5) ...
(4, 1), (4, 3), (4, 5) ... ;

the third scan encodes coefficients at the positions
(1, 0), (1, 2), (1, 4) ... (3, 0), (3, 2), (3, 4) ...
(5, 0), (5, 2), (5, 4) ... ;

the fourth scan encodes coefficients at the positions
(1, 1), (1, 3), (1, 5) ... (3, 1), (3, 3), (3, 5) ...
(5, 1), (5, 3), (5, 5) ... .

In short, all coefficients are encoded by four scans.

Generalizing this process to (dx=DX, dy=DY), in the
case of part (1) of Fig. 12 the scans proceed as shown
in part (4) of Fig. 12:

the first scan encodes coefficients at the positions
(0, 0), (0, DX), (0, 2*DX) ... ;

the second scan encodes coefficients at the positions
(0, 1), (0, DX+1), (0, 2*DX+1) ... ;

the third scan encodes coefficients at the positions
(0, 2), (0, DX+2), (0, 2*DX+2) ... ;

...

the DX-th scan encodes coefficients at the positions
(0, DX-1), (0, 2*DX-1), (0, 3*DX-1) ... ;

the (DX+1)-th scan encodes coefficients at the positions
(1, 0), (1, DX), (1, 2*DX) ... ;

the (DX+2)-th scan encodes coefficients at the positions
(1, 1), (1, DX+1), (1, 2*DX+1) ... ;

...

the (DX*DY)-th scan encodes coefficients at the
positions (DY-1, DX-1), (DY-1, 2*DX-1), (DY-1, 3*DX-1)
... (2*DY-1, DX-1), (2*DY-1, 2*DX-1), (2*DY-1, 3*DX-1)
... .

In short, all coefficients are encoded by (DX*DY)
scans. Part (2) of Fig. 12 corresponds to part (1) of
Fig. 12 with the vertical and horizontal directions
interchanged. Likewise, the coefficients can be
processed by scanning as shown in part (5) of Fig. 12
at horizontal and vertical intervals of (dx=DX, dy=DY).
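
The interval scanning enumerated above can be sketched as a small generator of scan passes. The function name interval_scan_passes is hypothetical; the sketch covers only the horizontal (raster-like) case of part (1) of Fig. 12, with positions written as (y, x) as in the text.

```python
def interval_scan_passes(height, width, dx, dy):
    """Return a list of passes, each a list of (y, x) positions.

    Pass number k (counting from 0) starts at row offset k // dx and
    column offset k % dx and then steps by dy rows and dx columns.
    """
    passes = []
    for offset_y in range(dy):
        for offset_x in range(dx):
            passes.append([(y, x)
                           for y in range(offset_y, height, dy)
                           for x in range(offset_x, width, dx)])
    return passes


if __name__ == "__main__":
    for number, positions in enumerate(interval_scan_passes(6, 6, 2, 2), start=1):
        print(number, positions[:3], "...")
```

With dx = dy = 2 the four passes start at (0, 0), (0, 1), (1, 0) and (1, 1), matching the four scans listed in the text, and with dx = dy = 1 a single raster scan results.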

In comparison with coefficients encoded by raster
scanning from the top left of a frame, subband
coefficients encoded by scanning at intervals between
the respective coefficients allow the content of the
image to be recognized at an earlier stage of decoding
and give a better subjective impression because the
quality of the decoded image improves gradually.

An example of an integrated component unit
containing a plurality of subbands included in the
respective components is described below. Fig. 13
shows another transmitting order of coefficients of
the subband image of Fig. 9. Part (b) of Fig. 13
presents an example of encoding a subband image in four
layers. Each of the blocks (Y2, Y3, Y4), (U2, U3, U4)
and (V2, V3, V4) in part (b) of Fig. 13 consists of
three (m=3) subbands as shown in part (a) of Fig. 13.

Namely, the three subbands (1, 2, 3) shown in part
(a) of Fig. 13 correspond to one element in an
integrated component unit. As these three subbands are
treated as one element, three coefficients having the
same relative positions in the respective three
subbands are treated as one set as shown in part (c) of
Fig. 13.

Three coefficients existing at the same relative
positions in the respective three subbands in Fig. 13
are regarded as one coefficient, i.e., (coefficient
Y_HLi, coefficient Y_LHi, coefficient Y_HHi) are
represented by a single coefficient. The component Y
in each of the layers shown in part (b) of Fig. 13
is transferred first in the scanning order shown in
Fig. 12. Similarly, the component U is transferred next
and the component V is then transferred. Coefficients
of subband images of the same resolution level in the
horizontal, vertical and diagonal directions are
transmitted together from the coding side, so the
resolutions of a reproduced image in the horizontal,
vertical and diagonal directions are increased at the
same time at the decoding side.
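
Treating three co-located coefficients as one set can be sketched as below. The function name interleave_colocated and the use of nested lists for subbands are illustrative assumptions; a plain raster scan stands in for whichever scanning order of Fig. 12 is selected.

```python
def interleave_colocated(hl, lh, hh):
    """Group coefficients at the same (y, x) of three equally sized subbands.

    hl, lh, hh are two-dimensional lists of one resolution level; the
    result lists the triples in raster order.
    """
    rows, cols = len(hl), len(hl[0])
    return [(hl[y][x], lh[y][x], hh[y][x])
            for y in range(rows)
            for x in range(cols)]


if __name__ == "__main__":
    hl = [[1, 2], [3, 4]]
    lh = [[5, 6], [7, 8]]
    hh = [[9, 10], [11, 12]]
    print(interleave_colocated(hl, lh, hh))
```

Each triple carries horizontal, vertical and diagonal detail for the same position, which is why all three directions sharpen together at the decoder.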

According to another method for treating three 
subbands as one element, a subband 1 is first 
transmitted completely, a subband 2 is then transmitted 
completely and a subband 3 is finally transmitted. 
Referring to Fig. 13, the subband 1 of the component Y 
is transmitted in the scanning order shown in Fig. 12. 
Next, the subband 2 of the component Y is transmitted 
in the same scanning order and then the subband 3 of 
the component Y is transmitted in the same scanning 
order. Subsequently, the subbands 1, 2 and 3 of the 
component U are transmitted one by one in the same 
scanning order as that for the component Y. Finally, 
the subbands 1, 2 and 3 of the component V are
transmitted one by one in the same scanning order as
that for the component Y.

In this case, an integrated component unit includes
the elements (the subband 1 of the component Y, the
subband 2 of the component Y, the subband 3 of the
component Y, the subband 1 of the component U, the
subband 2 of the component U, the subband 3 of the
component U, the subband 1 of the component V, the
subband 2 of the component V, the subband 3 of the
component V), and the resolution of an image reproduced
at the decoding side is increased in the horizontal,
vertical and diagonal directions in the described
order. Transmission of these three subbands in the
order of subband 2, subband 1 and subband 3 instead
increases the resolution of the image in the vertical,
horizontal and diagonal directions in the described
order at the decoding side.

The several examples of transmitting subband
coefficients described above may be used selectively.
If an image to be coded is known to have higher
resolution in a specified direction, coefficients of
the subband in the known high-resolution direction are
transferred first at the coding side, and the
transferring order is rearranged accordingly at the
decoding side so that the coefficients of the subband
image in the known high-resolution direction are
reproduced earlier. This makes it possible to increase
the quality of images in the decoding process at the
decoding side. In this instance, it is necessary to
inform the decoding side of the transmitting order in
which coefficients are encoded by placing such
information in the coded data.

The transmitting-order deciding/rearranging portion
104 shown in Fig. 7 decides the integrated component
units (in the case of Fig. 9) to be of subbands (Yi, Ui,
Vi), where i = 1 to 10 indicates the order of
transmitting subbands, rearranges the subbands in the
order of (Y1, U1, V1), (Y2, U2, V2), ..., (Y10, U10,
V10) as shown in Figs. 10 and 11 or in the order of
(Y1, U1, V1), (Y2, U2, V2), ..., (Y4, U4, V4) as shown
in part (b) of Fig. 13 and then outputs the rearranged
coefficient-coded data to the variable-length coding
portion 103.

Although the above-mentioned embodiment treats all 
coefficients in a subband as one group, it may also 
prepare an integrated component unit by using a 
coefficient or a plurality of coefficients in a subband 
as a group. The following example treats one line in a 
subband as a group of coefficients. 

Fig. 14 shows an integrated component unit
consisting of one horizontal line from each of the
subbands, which is expressed as follows:

(Yi(y), Ui(y), Vi(y)), where Yi(y), Ui(y), Vi(y) are
one-line data of the respective subbands of Y, U and V,
i = 1 to 10 indicates the order of transferring
subbands and y denotes each line number in the subbands.

In this instance, the order of transferring the
integrated component units is as follows:

(Y1(0), U1(0), V1(0)), (Y1(1), U1(1), V1(1)), ...,
(Y2(0), U2(0), V2(0)), (Y2(1), U2(1), V2(1)), ...,
...
(Y10(0), U10(0), V10(0)), (Y10(1), U10(1), V10(1)), ...

In Fig. 14, an integrated component unit consists
of one horizontal line from each of the subbands, which
corresponds to the scanning order shown in part (a)
of part (1) of Fig. 12. Besides this, it is also
possible to prepare an integrated component unit
composed of one vertical line from each of the subbands
as shown in part (a) of part (2) of Fig. 12. In
this instance, the components Y, U, V may each be
expressed by one arrow. This processing is done on all
subbands in the order shown in Fig. 10, Fig. 11 or part
(b) of Fig. 13.
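
The line-based transferring order can be sketched as below. The function name line_unit_order and the (component, i, y) tuples are hypothetical stand-ins for the one-line data Yi(y), Ui(y) and Vi(y); the line counts passed in are illustrative, not taken from the figures.

```python
def line_unit_order(lines_per_subband, components=("Y", "U", "V")):
    """Return line-based integrated component units in transferring order.

    lines_per_subband lists the number of lines of subband i = 1, 2, ...
    in the order the subbands are transferred.
    """
    units = []
    for i, line_count in enumerate(lines_per_subband, start=1):
        for y in range(line_count):
            # One unit: line y of subband i from every component.
            units.append(tuple((comp, i, y) for comp in components))
    return units


if __name__ == "__main__":
    for unit in line_unit_order([2, 2, 2]):
        print(unit)
```

Replacing the line index y with a column index would give the vertical-line variant mentioned above without changing the loop structure.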

It is also possible to prepare an integrated
component unit composed of one coefficient from each
of the subbands, which is expressed as:

(Yi(y, x), Ui(y, x), Vi(y, x)), where Yi(y, x),
Ui(y, x), Vi(y, x) are single coefficients in the
respective subbands of Y, U and V, i = 1 to 10
indicates the order of transferring subbands, y denotes
a position in the vertical direction in a subband and x
denotes a position in the horizontal direction in a
subband. In this instance, the transmitting order may
be any of the orders shown in parts (1), (2) and (3) of
Fig. 12. The processing is performed on all subbands in
the order shown in Fig. 10, Fig. 11 or part (b) of
Fig. 13.

Fig. 15 is a flow chart depicting an example of
the operation of the transmitting-order deciding and
rearranging portion 104 of Fig. 7. In the case shown,
the integrated component unit may be switched from
subbands to coefficients (one or more groups of
coefficients) or vice versa. The portion may also be
designed to operate by using only one of the two units.

The video coding device according to the first 
embodiment of the present invention can prepare coded 
data having a hierarchical structure by decomposing an 
input image composed of a plurality of components Y, U 
and V into subband images and encoding the subband 
images in an ascending order of their resolution 
starting from the lowest-resolution subband. 

In comparison with the conventional device that
integrates the components Y, U and V only on a
scanning-line basis, the first embodiment of the
present invention can adaptively encode input video
data in view of the data characteristics by applying
integrated component units according to subband-based
and/or coefficient-based integration of the components
Y, U and V.

Referring now to Fig. 8, a video decoding device 
embodying the present invention will be described below 
in detail. This video decoding device is intended to 
decode video data prepared by the video coding device 
according to the first embodiment of the present 
invention.

In Fig. 8, the video-decoding device comprises a
variable-length decoding portion (variable-length
decoding means) 201, a decoded-data counting portion
(decoded-data counting means) 202, a decoding
truncating portion (decoding truncating means) 203, a
data completing portion (data completing means) 205, a
coefficient decoding portion (coefficient decoding
means) 206 and a subband synthesizing portion (subband
synthesizing means) 207. These portions are similar in
construction to the variable-length decoding portion
2101, the decoded-data counting portion 2102, the
decoding truncating portion 2103, the data completing
portion 2105, the coefficient decoding portion 2106 and
the subband synthesizing portion 2107, respectively, of
Fig. 4.

In Fig. 8, numeral 204 designates a component
separating portion (component separating means,
arranging means) for separating coefficient-coded data
rearranged by the transfer-order deciding and
rearranging portion of the coding device into data for
respective components. The component separating portion
204 rearranges coded data into respective component
groups Y, U and V by reversing the procedure performed
at the coding side.

Accordingly, the component separating portion 204
has a memory (not shown) for storing the respective
components Y, U and V. This memory has basically the
same capacity as that of the subband decomposing
portion 101 of the video coding device. However, this
portion may be designed to separate an integrated
component unit into coefficients for respective
component groups Y, U, V and output the separated
coefficients to the coefficient decoding portion 206 on
completion of separation of the integrated component
unit. In this instance, the portion needs only enough
memory to store the largest integrated component unit.

The component separating portion 204 may also be
designed to work by successively separating and
outputting coefficients of an integrated component unit
to the coefficient decoding portion 206. In this case,
the component separating portion 204 needs only enough
memory to store a plurality of separated coefficients
irrespective of the size of any integrated component
unit to be separated. The separated coefficients
outputted from the component separating portion 204 are
decoded by the coefficient decoding portion 206 and
then stored in a memory (not shown) of the coefficient
decoding portion 206. In this case, the coefficient
decoding portion 206 includes decoded-data arranging
means.

Furthermore, it is also possible to write the 
decoded coefficients in a memory (not shown) for 
storing frequency-coefficients to be input into the 
subband synthesizing portion 207. 

As described above, the coefficients separated by
the component separating portion 204 may be stored in a
variety of memory means. For convenience, the
following explanation assumes that the component
separating portion 204 separates integrated component
units by writing the separated coefficients in its own
memory, which has the same capacity as the memory of
the subband decomposing portion 101 of the video coding
device.

For example, an integrated component unit composed
of subband data (Y1, U1, V1) is decomposed into the
separate elements Y1, U1 and V1. The separated
elements Y1, U1 and V1 are written into the
corresponding subband positions in the memory. Next, a
unit (Y2, U2, V2) is decomposed into separate elements
Y2, U2 and V2 that are then written in the specified
positions of the corresponding subbands in the memory
for storing Y, U and V. This processing is done on all
the subbands.
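
The separation of subband-based units into per-component memory can be sketched as below. The function name separate_units and the dictionary standing in for the memory are hypothetical; each element of a unit is treated as an opaque value (for example, the coded data of one subband).

```python
def separate_units(units, components=("Y", "U", "V")):
    """Write the elements of each integrated component unit into memory.

    units is a sequence of (Y_i, U_i, V_i) tuples in the received order;
    the result maps component -> {subband index i -> element}.
    """
    memory = {comp: {} for comp in components}
    for i, unit in enumerate(units, start=1):
        for comp, element in zip(components, unit):
            memory[comp][i] = element  # position of subband i for this component
    return memory


if __name__ == "__main__":
    received = [("y1", "u1", "v1"), ("y2", "u2", "v2")]
    print(separate_units(received))
```

This simply inverts the grouping done at the coding side, so it works only if the decoder knows which unit type and order the coder used.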

In this instance, the order of decoding
coefficients in each subband is the same as the
scanning order described with reference to Fig. 12.
For example, if the coding side performed scans as
shown in part (1) of Fig. 12, the decoding side must
also scan as shown in part (1) of Fig. 12. With this
scanning method, the resolution of an image being
decoded increases in raster-scan order from the top
left to the bottom right.

In a particular case when coefficients were encoded 
by scanning with spacing between them as shown in the 
part (4) or (5) of Fig. 12, an image being decoded can 
be easily recognized at an earlier stage of decoding as 
compared with the raster scanned image. The image may 
be gradually improved in resolution level, so the image 
may have better subjective-image quality. 

The operation of the component separating portion 
204 when processing an integrated component unit 
composed of one or more groups (sets) of coefficients 
in subbands is as follows: 

Assuming one line in a subband is considered as a 
group of coefficients, an integrated component unit is 
expressed as (Y±(y), Ui(y), V*.(y)) where Yi(y), U*(y), 
Vi(y) are one-line data of respective subbands Y, U and 
V, i = 1 - 10 (the order of transmitting subbands) and 
y designates a position in a vertical direction in a 
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subband . 

Yi(y), Ui(y) and Vi(y) are separated from each other 
and written in positions <line y> of the corresponding 
subbands i in the memory for storing Y, U and V. This 
processing is done on all the lines in the subbands in 
order from the lowest-resolution subband to the 
highest-resolution subband. 
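
By way of illustration only, the line-by-line separation 
described above may be sketched in Python as follows. The 
function name separate_line_units, the tuple layout 
(i, y, Y line, U line, V line) and the dictionary used as 
the memory are assumptions made for this sketch and are 
not taken from the embodiment. 

```python
def separate_line_units(units, lines_per_subband):
    """Write each one-line unit (Yi(y), Ui(y), Vi(y)) into per-component memory.

    units: iterable of (i, y, y_line, u_line, v_line) in transmission order.
    lines_per_subband: dict mapping subband index i to its number of lines.
    """
    # One memory area per component; None marks a line not yet received.
    memory = {
        comp: {i: [None] * n for i, n in lines_per_subband.items()}
        for comp in ("Y", "U", "V")
    }
    for i, y, y_line, u_line, v_line in units:
        memory["Y"][i][y] = y_line
        memory["U"][i][y] = u_line
        memory["V"][i][y] = v_line
    return memory


# Hypothetical data: two lines of subband 1, nothing yet for subband 2.
units = [(1, 0, [1, 2], [3, 4], [5, 6]), (1, 1, [7, 8], [9, 10], [11, 12])]
memory = separate_line_units(units, {1: 2, 2: 2})
```

Lines that have not been received remain marked as 
vacant, which is the state the data completing portion 
205 later has to handle. 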

When an integrated component unit is composed of 
coefficients selected one from each subband, it is 
expressed as (Yi(y, x), Ui(y, x), Vi(y, x)), where 
Yi(y, x), Ui(y, x) and Vi(y, x) are single-coefficient 
data of the respective subbands of Y, U and V, i = 1 to 
10 (the order of transmitting subbands), y designates a 
position in the vertical direction in a subband and x 
designates a position in the horizontal direction in a 
subband. 

Elements Yi(y, x), Ui(y, x) and Vi(y, x) are separated 
from each other and written in the coefficient 
positions (y, x) of the corresponding subbands i in the 
memory for storing Y, U and V. This processing is done 
on all the coefficients in the subbands from the 
lowest-resolution subband to the highest-resolution 
subband. 

As described before with reference to Fig. 12, the 
decoding side applies the same coefficient-scanning 
method as the coding side used, whether an integrated 
component unit is selected by one line or by one 
coefficient. 

The separate coefficients outputted from the 
component separating portion 204 are combined into 
respective component groups Y, U and V and then treated 
as respective groups. 

Fig. 16 is a flow chart depicting an exemplified 
operation procedure of the component separating portion 
204. In the instance shown in Fig. 16, the integrated 
component unit may be changed over from the subbands to 
coefficients (one or more groups of coefficients) or 
vice versa. The portion 204 may be provided with either 
one of the two integrated component units. 

Referring to Fig. 8, the operation of the data 
completing portion 205 will be described below in 
detail: 

In this case, an integrated component unit is 
composed of one-line data in a subband. 

Now let us suppose that the decoding operation was 
stopped by the action of the decoding truncating 
portion 203 because the number of bits of the decoded 
data exceeded a threshold value when, for instance, the 
one-line integrated component units shown in Fig. 14, 
e.g., (Y1(0), U1(0), V1(0)), ..., (Y3(5), U3(5), 
V3(5)), had been decoded. In this instance, 
coefficients of subbands 1 and 2 and coefficients of 
the first line to the fifth line of subband 3 have 
been decoded, but the remaining parts have no 
coefficients. 

Data completing portion 205 produces subband 
coefficients by putting 0 in remaining vacant parts 
where no data exist. This enables the coefficient 
decoding portion 206 and the subband synthesizing 
portion 207 to normally perform subsequent processing 
steps. Vacant data may also be replaced with any other 
value than 0. The provision of the decoded data 
counting portion 202, the decoding truncating portion 
203 and the data completing portion 205 enables the 
decoding side to truncate the decoding operation at any 
position at the user's request. 
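
The interplay of bit counting, truncation and data 
completion may be sketched as follows for a single 
subband. This is a minimal illustration under stated 
assumptions: the function name decode_with_truncation, 
the (y, line, nbits) tuple layout and the per-line bit 
counts are hypothetical, and the variable-length decoding 
itself is not modelled. 

```python
def decode_with_truncation(coded_lines, threshold_bits, shape, fill=0):
    """Decode lines until the bit count exceeds the threshold, then fill gaps.

    coded_lines: iterable of (y, line, nbits) in transmission order.
    shape: (number of lines, coefficients per line) of the subband.
    fill: value written where no data exist (0 in the embodiment).
    """
    n_lines, width = shape
    lines = [None] * n_lines
    decoded_bits = 0
    for y, line, nbits in coded_lines:
        lines[y] = list(line)
        decoded_bits += nbits          # role of the counting portion 202
        if decoded_bits > threshold_bits:
            break                      # role of the truncating portion 203
    # Role of the completing portion 205: vacant lines receive the fill value.
    return [ln if ln is not None else [fill] * width for ln in lines]


coded = [(0, [1, 2], 10), (1, [3, 4], 10), (2, [5, 6], 10)]
truncated = decode_with_truncation(coded, threshold_bits=15, shape=(3, 2))
complete = decode_with_truncation(coded, threshold_bits=10**9, shape=(3, 2))
```

In this sketch the line whose decoding causes the count 
to exceed the threshold is kept, and a very large 
threshold leaves the data untouched, so that all the 
coded data are decoded. 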

The data completing portion 205 does nothing while 
the number of bits of the decoded data does not exceed 
the threshold value. Accordingly, a maximally 
expressible value may be set in advance as the 
threshold value when all the coded data are to be 
decoded. In this instance, the embodiment may be the 
system of Fig. 8 with the decoded data counting portion 
202, the decoding truncating portion 203 and the data 
completing portion 205 omitted. 

The coded data of the hierarchical structure, which 
represents an image composed of a plurality of 
components Y, U and V, can be decoded at the decoding 
side. The coded data of the hierarchical structure 
allows progressive decoding of the coded data, whereby 
the quality of the entire reproduced image is 
sequentially improved. As compared with the prior art 
method, which is limited to integrated component units 
of lines, one for each component, the present invention 
enables the system to conduct a variety of progressive 
reproductions of the coded image. For example, the 
method of the present invention can uniformly improve 
the resolution of the reproduced image in the 
horizontal, vertical and diagonal directions, whereas 
the prior art method improves the image resolution in 
the horizontal direction before the other directions. 

A second embodiment of the present invention for 
coding and decoding an image composed of components 
having different sizes is described below. The second 
embodiment of the present invention is similar in 
construction to the first embodiment except for the 
operation of the transfer-order deciding and 
rearranging portion 104 (step of outputting an 
integrated component unit). Therefore, the same 
portions as the first embodiment are not explained 
further. Only the transfer-order deciding and 
rearranging portion 104 will be described below. 

For example, the 4:2:0 format is used, in which the 
components U and V have horizontal and vertical sizes 
one-half those of the component Y. Fig. 17 shows the 
result of three band-decompositions of the component Y, 
whose size is Y*X, and of the components U and V, whose 
respective sizes are Y/2*X/2. In this case, the 
subbands of U and V are each half the size of those of 
the component Y in the horizontal and vertical 
directions. The respective components Y, U and V have 
the same number of subbands. 

Accordingly, no problem arises with subband-based 
integrated units when applying the same scanning method 
and the same transmitting order as used in the first 
embodiment. However, a problem may arise with an 
integrated component unit composed of one or a 
plurality of coefficients. 

The subbands of Y, U and V differ from each other in 
size, so the large-sized component Y may have excess 
coefficients if an integrated component unit is 
composed of the same number of coefficients per subband 
as described before in the first embodiment of the 
present invention. Therefore, the number of 
coefficients per component to be included in an 
integrated component unit is set according to the size 
ratio of the respective components if the components 
are different from each other in size. 

Part (1) of Fig. 18 shows the correspondence of 
coefficients of the respective components for an image 
in the 4:2:0 format as shown in Fig. 17. As the 
components Y, U and V have the same number of subbands 
per component, the order of transferring the integrated 
component units is the same as shown in Fig. 10 for the 
first embodiment of the present invention. In this 
instance, the ratio of the horizontal and vertical 
lengths of the components Y, U and V is 2:1:1, so 
one coefficient of each component U or V corresponds to 
4 coefficients of the component Y. Accordingly, the 
numbers of coefficients for the components Y, U and V 
to be included in an integrated component unit are 
determined according to the ratio of 4:1:1. 
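
The arithmetic leading from the 2:1:1 length ratio to the 
4:1:1 coefficient ratio may be sketched as follows. The 
function name block_shape_per_unit and the example image 
size of 288 lines by 352 pixels are assumptions made for 
illustration; the sketch also assumes that every 
component size is an integer multiple of the smallest 
one. 

```python
def block_shape_per_unit(sizes):
    """Return, per component, the (rows, cols) block placed in one unit.

    sizes: dict mapping component name to (height, width).
    """
    min_h = min(h for h, _ in sizes.values())
    min_w = min(w for _, w in sizes.values())
    shapes = {}
    for name, (h, w) in sizes.items():
        if h % min_h or w % min_w:
            raise ValueError("sizes must be integer multiples of the smallest")
        shapes[name] = (h // min_h, w // min_w)
    return shapes


# 4:2:0 example: U and V are half the size of Y in both directions.
shapes = block_shape_per_unit({"Y": (288, 352), "U": (144, 176), "V": (144, 176)})
counts = {name: rows * cols for name, (rows, cols) in shapes.items()}
```

Here each U or V coefficient is paired with a 2*2 block 
of Y coefficients, giving the ratio of 4:1:1. 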

The integrated component units prepared on the 
basis of the subbands of respective components contain 
coefficients of the components Y, U and V at the ratio 
of 4:1:1 and are processed in the same manner as 
described before for the first embodiment of the 
present invention. 

Accordingly, an integrated component unit for one 
line per subband is prepared to contain two lines of Y, 
one line of U and one line of V as shown in Fig. 19, 
while an integrated component unit for a coefficient 
group per subband is prepared to contain 2*2 
coefficients of Y, one coefficient of U and one 
coefficient of V as shown in Fig. 20. In Fig. 19, the 
numerals suffixed to the component data Y, U and V 
indicate, by way of example, the order of transferring 
the data within the respective integrated component 
units. The integrated component units for Y, U and V 
are processed one by one for one subband. On completion 
of processing one subband, the process advances to 
another subband in the order shown in part (1) of Fig. 
18. The order of transferring the subbands may be 
either of those shown in Fig. 11 and part (b) of Fig. 
13. 
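
The two kinds of units described above for one subband of 
a 4:2:0 image may be sketched as follows. The function 
names line_units_420 and coefficient_units_420 are 
hypothetical, subbands are modelled as lists of lines, 
and the order Y, Y, U, V inside a unit is only one 
possible order assumed for this sketch. 

```python
def line_units_420(y_sub, u_sub, v_sub):
    """Units of two Y lines, one U line and one V line."""
    if not (len(y_sub) == 2 * len(u_sub) == 2 * len(v_sub)):
        raise ValueError("Y subband must have twice as many lines as U and V")
    return [
        (y_sub[2 * k], y_sub[2 * k + 1], u_sub[k], v_sub[k])
        for k in range(len(u_sub))
    ]


def coefficient_units_420(y_sub, u_sub, v_sub):
    """Units of a 2*2 block of Y coefficients, one U and one V coefficient."""
    units = []
    for r, (u_row, v_row) in enumerate(zip(u_sub, v_sub)):
        for c, (u, v) in enumerate(zip(u_row, v_row)):
            # The 2*2 Y block co-located with U/V position (r, c).
            block = [y_sub[2 * r + dr][2 * c + dc] for dr in (0, 1) for dc in (0, 1)]
            units.append((block, u, v))
    return units


y_sub = [[1, 2, 3, 4], [5, 6, 7, 8]]   # 2 lines of 4 coefficients
u_sub = [[10, 11]]                     # 1 line of 2 coefficients
v_sub = [[20, 21]]
by_line = line_units_420(y_sub, u_sub, v_sub)
by_coefficient = coefficient_units_420(y_sub, u_sub, v_sub)
```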

With differently sized components Y, U and V, the 
video decoding side can decode the coded data 
transmitted from the coding side by changing the 
numbers of data of components contained in an 
integrated component unit according to the ratio of 
horizontal and vertical sizes of the components and 
writing the data in the corresponding positions in a 
memory. 

The same scanning method and the same transferring 
order as described for the first embodiment can be 
applied in this embodiment when working with the 
subband-based integrated component units. 

The video decoding device according to the second 
embodiment of the present invention is similar in 
construction and function to that of the first 
embodiment except for the operation of the component 
separating portion 204 (step of separating an 
integrated component unit into respective components 
and writing separated data in corresponding subbands in 
a memory). Therefore, further description is omitted. 

In the second embodiment of the present invention, 
it is possible to give coded data a hierarchical 
structure even if components of an image have different 
sizes. The decoding side can decode the entire coded data 
and can also obtain an entire reproduced image from a 
part of the coded data. 

Although the second embodiment has been described 
with only an image having components whose horizontal 
and vertical size ratio is 2:1:1, it can treat other 
size ratios of image components in a similar manner 
as described above. 

A third embodiment of the present invention is 
adaptable to the case of processing image components 
being different in size and decomposed into different 
numbers of decomposition levels by the subband 
decomposing portion 101 of Fig. 7. This embodiment of 
the present invention is similar to the first 
embodiment except for the operation of the transfer- 
order deciding and rearranging portion 104 (step of 
outputting an integrated component unit). Therefore, 
the same portions are not explained further. Only the 
transfer-order deciding and rearranging portion 104 
will be described below. 

Referring to Fig. 21, this embodiment is described 
by way of example with an input image whose components 
U and V have horizontal and vertical lengths one-half 
those of the component Y. A part designated by 
(1) of Fig. 21 shows the results of decomposing the 
component Y three times and the components U and V 
twice respectively. As the number of subbands of Y 
differs from that of U and V, the third embodiment 
cannot use the transferring methods described for the 
first and second embodiments and so uses the following 
method of transferring the subbands. 

An integrated component unit composed of subbands 
of the respective components Y, U and V is first 
described. In this instance (first example), 
combinations of three-component subbands Y, U and V, 
prepared from the seven low-resolution subbands of each 
of the components Y, U and V as shown in part (2) of 
Fig. 21, are transferred one by one to the 
variable-length coding portion 103 of Fig. 7, and then 
the three remaining subbands of the component Y are 
transferred independently one by one to the 
variable-length coding portion 103. In short, each 
integrated component unit to be output is composed 
either of one low-resolution subband of each of the 
components Y, U and V, or of one high-resolution 
subband of the component Y alone with no 
high-resolution subbands of the components U and V. 

Accordingly, the transferring order is expressed as 
(Y1, U1, V1), (Y2, U2, V2), ..., (Y7, U7, V7), (Y8), 
(Y9), (Y10). In this instance, the subbands of the 
respective components (Yi, Ui, Vi), where i = 1 to 7 
designates the transferring order, have the same sizes, 
so the same scanning method as used in the first 
embodiment can be applied whether the integrated 
component unit consists of subbands or of one or more 
groups of coefficients. 
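
The transferring order of the first example may be 
generated as in the following sketch. The function name 
first_example_order and the string labels are assumptions 
made for illustration; the defaults correspond to ten 
subbands for Y and seven for each of U and V. 

```python
def first_example_order(n_y=10, n_uv=7):
    """Units (Yi, Ui, Vi) for the shared subbands, then the remaining Y alone."""
    units = [(f"Y{i}", f"U{i}", f"V{i}") for i in range(1, n_uv + 1)]
    units += [(f"Y{i}",) for i in range(n_uv + 1, n_y + 1)]
    return units


order = first_example_order()
```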

Subbands of the components Y, U and V may also be 
combined by threes as shown in part (b) of Fig. 13 to 
form the following combinations: 

(Y1, U1, V1), (Y2, Y3, Y4, U2, U3, U4, V2, V3, V4), 
(Y5, Y6, Y7, U5, U6, U7, V5, V6, V7), (Y8), (Y9), (Y10). 

Another example (second example) is shown in Fig. 
22. The subband images shown in a part (1) of Fig. 22 
are obtained as the result of decomposing the component 
Y three times and the components U and V twice 
respectively. In this instance, integrated component 
units to be transmitted are formed by combining a set 
of four low-resolution subbands Y with two low- 
resolution subbands U and V as shown in a part (2) of 
Fig. 22 and by combining high-resolution subbands of 
the components Y, U and V with each other by ones as 
shown in a part (3) of Fig. 22. 

The order of transferring the subbands is as 
follows: (Y1, Y2, Y3, Y4, U1, V1), (Y5, U2, V2), (Y6, 
U3, V3), (Y7, U4, V4), (Y8, U5, V5), (Y9, U6, V6), 
(Y10, U7, V7). 

In the second example, three components have 
corresponding subbands having different sizes, so the 
same scanning method as used in the second embodiment 
can be applied to each integrated component unit. 
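
The transferring order of the second example may be 
derived from the decomposition levels as in the following 
sketch. The function name second_example_order is 
hypothetical, and the sketch assumes that a component 
decomposed L times has 3L + 1 subbands, which agrees with 
the ten subbands of Y and the seven subbands of U and V 
in this example. Its behaviour for level differences 
other than one is an extrapolation, not something stated 
in the embodiment. 

```python
def second_example_order(levels_y=3, levels_uv=2):
    """First unit gathers the extra low-resolution Y subbands with U1 and V1."""
    n_uv = 3 * levels_uv + 1                # subbands per chrominance component
    extra = 3 * (levels_y - levels_uv)      # Y subbands with no U/V counterpart
    first = tuple(f"Y{i}" for i in range(1, extra + 2)) + ("U1", "V1")
    rest = [(f"Y{i + extra}", f"U{i}", f"V{i}") for i in range(2, n_uv + 1)]
    return [first] + rest


order = second_example_order()
```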

Subbands (Y5, Y6, Y7) shown in part (3) of Fig. 22 
correspond to a vertical-resolution subband, a 
horizontal-resolution subband and a diagonal-resolution 
subband respectively and have the same resolution 
level. Therefore, each integrated component unit can 
be prepared by combining the high-resolution subbands 
of the respective components Y, U and V by threes 
for each component. This is the third example of 
preparing an integrated component unit. Similarly, 
integrated component units may be prepared from the 
combinations (Y8, Y9, Y10), (U2, U3, U4), (U5, U6, U7), 
(V2, V3, V4), (V5, V6, V7). 

In this instance, the transmitting order is as 
follows: (Y1, Y2, Y3, Y4, U1, V1), (Y5, Y6, Y7, U2, U3, 
U4, V2, V3, V4), (Y8, Y9, Y10, U5, U6, U7, V5, V6, V7). 
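
The grouping by threes of the third example may be 
sketched as follows. The function name 
third_example_order is hypothetical, and the numbering of 
subbands (one lowest-resolution subband followed by three 
subbands per decomposition level) is an assumption 
consistent with the order listed above. 

```python
def third_example_order(levels_y=3, levels_uv=2):
    """Low-resolution unit first, then one unit per level with three subbands each."""
    extra = 3 * (levels_y - levels_uv)      # Y subbands with no U/V counterpart
    units = [tuple(f"Y{i}" for i in range(1, extra + 2)) + ("U1", "V1")]
    for level in range(levels_uv):
        y = [f"Y{extra + 2 + 3 * level + k}" for k in range(3)]
        u = [f"U{2 + 3 * level + k}" for k in range(3)]
        v = [f"V{2 + 3 * level + k}" for k in range(3)]
        units.append(tuple(y + u + v))
    return units


order = third_example_order()
```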

In the above-described example, subbands Y1, Y2, 
Y3, Y4, U1 and V1 are selected as a plurality of low- 
resolution subbands while keeping the size ratio of Y, 
U and V (in this example, the horizontal and vertical 
size ratio is 2:1:1). Besides the above combination, 
Y1, Y2, ..., Y7, U1, ..., U4, V1, ..., V4 may also be 
selected as a plurality of the low-resolution subbands. 

A fourth example, which is a variation of the 
above-mentioned second or third example, is such that 
the lowest-resolution subbands of Y, U and V are 
extracted from the respective groups of low-resolution 
subbands Y1, Y2, ..., Y7, U1, ..., U4, V1, ..., V4 of 
the respective components and separately transferred as 
shown in part (4) of Fig. 22, and the set of remaining 
low-resolution subbands is then transferred as in the 
second or third example. In this instance, the set of 
low-resolution subbands is a combination of six 
Y-component subbands, three U-component subbands and 
three V-component subbands as shown in part (5) of Fig. 
22. After transmission of all the low-resolution 
subbands, the high-resolution subbands shown in part 
(6) of Fig. 22 are transferred. 

In the case of combining subbands one for each 
component (corresponding to the second example), the 
order of transferring the subbands is as follows: (Y1, 
U1, V1), (Y2, Y3, Y4, Y5, U2, V2), (Y6, U3, V3), (Y7, 
U4, V4), (Y8, U5, V5), (Y9, U6, V6), (Y10, U7, V7). 



In the case of combining subbands by threes for 
each component (corresponding to the third example), 
the order of transferring the subbands is as follows: 
(Y1, U1, V1), (Y2, Y3, Y4, Y5, Y6, Y7, U2, U3, U4, V2, 
V3, V4), (Y8, Y9, Y10, U5, U6, U7, V5, V6, V7). 

For components having subband decomposition levels 
of not less than 4, subbands having resolution levels 
higher than that of a subband shown in a part (6) of 
Fig. 22 are the same in quantity for components Y, U 
and V. In this instance, integrated component units, 
each composed of a combination of the same number of 
subbands of Y, U and V, are subsequently transmitted. 

In the first to fourth examples above, the 
transmitting order of the three subbands which have the 
same resolution level in the horizontal, vertical and 
diagonal directions is not restricted. In other 
words, for three subbands which have the same 
resolution level as shown in part (a) of Fig. 13, 
the transmitting order of these three subbands may be 
not only subband 1, subband 2, subband 3, but also 
subband 2, subband 1, subband 3, for example. 

In the first example, the video decoding side can 
decode the coded data received from the coding side by 
separating components in each integrated component unit 
into groups of respective components Y, U and V, 
writing separated data in the corresponding subband 
areas in the memory and finally writing three highest- 
resolution subbands in the corresponding subband area 
of the memory. 

In the second example, the video decoding side can 
decode the coded data received from the coding side by 
separating the subbands in each integrated component 
unit by reversing the process made by the coding side 
and writing the subbands of each component in the 
corresponding subband area in the corresponding memory. 
In this case, the third embodiment differs from the 
first and second embodiments by the fact that only the 
first integrated component unit is prepared from a 
plurality of the low-resolution subbands by the coding 
side and contains four subbands Y, one subband U and 
one subband V. 

As described above, the number of low-resolution 
subbands to be combined with each other can be freely 
selected. For example, an initial integrated component 
unit may contain seven Y-component subbands, four U- 
component subbands and four V-component subbands. 

In the third example, each integrated component 
unit can be decomposed into subbands of respective 
components Y, U and V by the same method as described 
for the second example and written into corresponding 
subband areas (for Y, U and V components) in the memory 
for decoding them. Since the high-resolution subbands 
Y, U and V have been combined by threes in an 
integrated component unit at the coding side, the high- 
resolution subbands Y, U and V are recorded by threes 
for each component in corresponding subband areas in 
the memory for decoding them. This example differs from 
the second example by the above-mentioned feature. 

In the fourth example, the video decoding side can 
decode the coded data received from the coding side by 
separating the subbands in each integrated component 
unit in the same manner as in the decoding device of the 
second or third example and writing the subbands of 
each component in the corresponding subband area in the 
corresponding memory. This example differs from the 
second and third examples by the fact that the lowest- 
resolution subbands of the respective components Y, U, 
V are first stored in the corresponding subband areas 
(Y, U, V) in the memory for decoding and then low- 
resolution subbands other than the lowest-resolution 
subbands are stored in the corresponding areas (Y, U, 
V) in the memory. 

The video decoding device according to the third 
embodiment of the present invention is similar to that 
of the first embodiment except for the operation of the 
component separating portion 204 (step of separating 
each integrated component unit into respective 
components and writing the separated data in the 
corresponding areas in a memory). Therefore, further 
explanation is omitted. 

With integrated component units each composed of 
one or more groups of coefficients, such as lines, 
instead of subbands of the respective components Y, U 
and V, the third embodiment can apply the same process 
as described for the corresponding case in the first 
and second embodiments, based upon the same method as 
described for the integrated component units composed 
of subbands of the respective components Y, U and V. 

In the third embodiment of the present invention, 
an image whose components have different sizes and 
different decomposition levels can be encoded so that 
coded data having a hierarchical structure is obtained 
at the coding side and an entire image is reproduced 
from the entire coded data or a part of the coded data 
at the decoding side. 

Although the third embodiment has been described by 
way of example with only an image having components 
whose size ratio is 2:1:1, it can treat other size 
ratios of image components in a similar manner as 
described above. For example, an image whose components 
Y, U and V are the same in size and have different 
numbers of subbands can be encoded to have a 
hierarchical structure through the same process as 
described above in the third embodiment. The 
transmitting orders corresponding to those shown in 
Figs. 11 and 13 may also be adopted besides the 
described order of Fig. 10. 

The three embodiments of the present invention have 
been described by way of example with a specified 
order of transferring the elements Y, U and V in the 
integrated component units, but the invention is not 
limited to that order. 



The present invention brings about the following 
advantageous effects. 

Firstly, the video coding device and the video 
decoding device according to the present invention 
operate with integrated component units whose elements 
are all the frequency coefficients in m (m ≥ 1) 
subbands of each of the respective components A^n, and 
can therefore transmit and decode first the specified 
subbands of the image components, which may be 
components Y, U and V having different levels of 
influence on human visual characteristics, allowing one 
to recognize the essence of the image at an earlier 
stage of decoding at the decoding side. When an image 
to be coded is known to be of higher resolution in a 
specified direction, the coding device can first 
transmit the coded coefficients of the subbands for the 
higher-resolution direction, and the decoding device 
can decode those coded coefficients, terminate the 
decoding midway through the coded data and reproduce 
the image from only the data decoded up to that time, 
improving the subjective quality of the image. 

Secondly, the video coding device and the video 
decoding device according to the present invention 
operate with integrated component units whose elements 
are m (m ≥ 1) pieces of frequency coefficients having 
the same relative positions in m (m ≥ 1) subbands of 
each of the respective components A^n, and can decode 
those coded coefficients, terminate the decoding midway 
through the coded data and reproduce the image from 
only the data decoded up to that time, improving the 
subjective quality of the image when the image is 
composed of components R, G and B that have 
substantially the same influence on human visual 
characteristics. 

Thirdly, the video coding device and the video 
decoding device according to the present invention can 
be adapted to process an image whose luminance and 
chrominance components differ from each other in 
resolution, which is a great advantage over the 
conventional method that can be applied only to an 
image whose components have the same resolution. This 
feature provided by the present invention is 
particularly desirable for digital image processing, 
since many digital images are formatted to have a 
higher resolution for the luminance component than for 
the chrominance components. 

Fourthly, the video coding device and the video 
decoding device according to the present invention can 
be adapted to process an image whose luminance and 
chrominance components differ from each other in 
resolution and have different subband-decomposition 
levels, which is a great advantage over the 
conventional method that can be applied only to an 
image whose components have the same resolution and the 
same number of subbands. This feature provided by the 
present invention is particularly desirable for digital 
image processing, since many digital images are 
formatted to have a higher resolution for the luminance 
component than for the chrominance components, and it 
is common to use different subband-decomposition levels 
according to the resolution levels of the components. 

Fifthly, the video coding device and the video 
decoding device according to the present invention can 
be adapted to process an image whose luminance and 
chrominance components have different resolution levels 
and different subband-decomposition levels, which is a 
great advantage over the prior art method that can be 
applied only to an image whose components have the same 
resolution and the same number of subbands. 
Furthermore, these aspects of the present invention 
provide the feature that each integrated component 
unit always reflects the ratio of the numbers of the 
respective components contained in an input image. This 
feature eliminates the need for decoding redundant 
data at the decoding side when decoding an amount of 
data according to the resolution of the display unit. 

Sixthly, in addition to the fifth advantageous 
effect mentioned above, the video coding device and the 
video decoding device according to the present 
invention can first separate and transmit the lowest- 
resolution subbands of the respective components A^n to 
present a summary of the image first, making it 
possible to improve the subjective quality of the 
reproduced image. 

CLAIMS 



1. A video coding device provided with a subband- 
decomposing means for decomposing an image being 
composed of N (N ≥ 2) kinds of luminance or chrominance 
components into subband images for each of components 
A^n (1 ≤ n ≤ N, where n is an integer) composing an image to 
be coded, coefficient coding means for encoding a 
frequency coefficient of the subband images, 
rearranging means for preparing integrated component 
units by combining frequency coefficients included in 
respective components A^n according to the coded 
coefficient data prepared by the coefficient coding 
means and rearranging the prepared integrated component 
units of the coefficient-coded data in an ascending 
order of subband image resolution, starting from the 
integrated component unit including the coded 
coefficient data of the lowest resolution subband, and 
a variable-length coding means for performing variable- 
length encoding of the rearranged coefficient-coded 
data, wherein the rearranging means prepares each of 
the integrated component units by setting therein the 
frequency coefficients contained in the respective 
components A^n, said coefficients being all frequency 
coefficients included in m (m ≥ 1) subbands for the 
respective components A^n when the components A^n have 
the same size and the same number of subbands. 

2. A video coding device provided with a subband- 
decomposing means for decomposing an image being 
composed of N (N ≥ 2) kinds of luminance or chrominance 
components into subband images for each of components 
A^n (1 ≤ n ≤ N, where n is an integer) composing an image to 
be coded, coefficient coding means for encoding a 
frequency coefficient of the subband images, 
rearranging means for preparing integrated component 
units by combining frequency coefficients included in 
respective components A^n according to the coded 
coefficient data prepared by the coefficient coding 
means and rearranging the prepared integrated component 
units of the coefficient-coded data in an ascending 
order of subband image resolution, starting from the 
integrated component unit including the coded 
coefficient data of the lowest resolution subband, and 
a variable-length coding means for performing variable- 
length encoding of the rearranged coefficient-coded 
data, wherein the rearranging means prepares each of 
the integrated component units by setting therein the 
frequency coefficients included in the respective 
components A^n, said coefficients being m (m ≥ 1) pieces 
of frequency coefficients contained at the same 
relative positions in m (m ≥ 1) pieces of the respective 
components' subbands when the components A^n have the 
same size and the same number of subbands. 

3. A video coding device provided with a subband- 
decomposing means for decomposing an image being 
composed of N (N ≥ 2) kinds of luminance or chrominance 
components into subband images for each of components 
A^n (1 ≤ n ≤ N, where n is an integer) composing an image to 
be coded, coefficient coding means for encoding a 
frequency coefficient of the subband images, 
rearranging means for preparing integrated component 
units by combining frequency coefficients included in 
respective components A^n according to the coded 
coefficient data prepared by the coefficient coding 
means and rearranging the prepared integrated component 
units of the coefficient-coded data in an ascending 
order of subband image resolution, starting from the 
integrated component unit including the coded 
coefficient data of the lowest resolution subband, and 
a variable-length coding means for performing variable- 
length encoding of the rearranged coefficient-coded 
data, wherein the rearranging means prepares each of 
the integrated component units by setting therein the 
different number of frequency-coefficients in the 
respective components A^n according to each component 
size when the components A^n are different in size and 
have the same number of subbands. 

4. A video coding device provided with a subband- 
decomposing means for decomposing an image being 
composed of N (N ≥ 2) kinds of luminance or chrominance 
components into subband images for each of components 
A^n (1 ≤ n ≤ N, where n is an integer) composing an image to 
be coded, coefficient coding means for encoding a 
frequency coefficient of the subband images, 
rearranging means for preparing integrated component 
units by combining the subbands included in respective 
components A^n according to the coded coefficient data 
prepared by the coefficient coding means and 
rearranging the prepared integrated component units of 
the coefficient-coded data in an ascending order of 
subband image resolution, starting from the integrated 
component unit including the coded coefficient data of 
the lowest resolution subband, and a variable-length 
coding means for performing variable-length encoding of 
the rearranged coefficient-coded data, wherein the 
rearranging means prepares each of the integrated 
component units by combining the same number of low- 
resolution subbands and the different number of high- 
resolution subbands for the respective components A^n 
when the components A^n are different in size and 
different in the number of subbands. 

5. A video coding device provided with a subband- 
decomposing means for decomposing an image being 
composed of N (N ≥ 2) kinds of luminance or chrominance 
components into subband images for each of components 
A^n (1 ≤ n ≤ N, where n is an integer) composing an image to 
be coded, coefficient coding means for encoding a 
frequency coefficient of the subband images, 
rearranging means for preparing integrated component 
units by combining the subbands included in respective 
components A" according to the coded coefficient data 
prepared by the coefficient coding means and 
rearranging the prepared integrated component units of 
the coefficient-coded data in an ascending order of 
subband image resolution, starting from the integrated 
component unit including the coded coefficient data of 
the lowest resolution subband, and a variable-length 
coding means for performing variable-length encoding of 
the rearranged coefficient-coded data, wherein the 
rearranging means prepares each of the integrated 
component units by combining the same number of high- 
frequency subbands and low-frequency subbands being 
different in number for respective components A" when 
the components A" are different in sizes and different 
in the number of subbands. 

6. A video-coding device as defined in claim 5, 
wherein the rearranging means prepares each of the 
integrated component units by combining lowest ones of 
resolution subbands of the respective components A" and 
all other low-resolution subbands being different in 
number for respective components A".

7. A video decoding device provided with a 
variable-length decoding means for decoding variable- 
length coded data, a decoded-data counting means for 
counting bits of each integrated component unit decoded 
by the variable-length decoding means, a decoding 
truncating means for comparing the number of bits 
counted by the decoded-data counting means with an 
externally-given number of bits and giving a decoding- 
stop command when the number of decoded bits exceeds 
the given number of bits, a component separating means 
for separating the decoded integrated component unit 
into respective components A", a data completing means 
for compensating for lack of truncated data by adding a 
specified value to each of the components composing a 
screenful image, data arranging means for arranging 
coded coefficient data separated by the component 
separating means into specified positions for 
respective components A", a coefficient decoding means 
for decoding coded-coefficient data separated and
arranged for respective components A" by the component 
separating means, and a subband synthesizing means for 
reproducing a decoded image by combining subbands of 
data decoded by the coefficient decoding means for 
respective components A", wherein the component 
separating means separates the integrated component 
unit as combinations of all frequency coefficients
contained in m (m ≥ 1) subbands for respective components
A" when the components A" have the same size and the 
same number of subbands. 

8. A video decoding device provided with a 
variable-length decoding means for decoding variable- 
length coded data, a decoded-data counting means for 
counting bits of each integrated component unit decoded 
by the variable-length decoding means, a decoding
truncating means for comparing the number of bits 
counted by the decoded-data counting means with an 
externally-given number of bits and giving a decoding- 
stop command when the number of decoded bits exceeds 
the given number of bits, a component separating means 
for separating the decoded integrated component unit 
into respective components A", a data completing means 
for compensating for lack of truncated data by adding a 
specified value to each of the components composing a 
screenful image, data arranging means for arranging 
coded coefficient data separated by the component 
separating means into specified positions for 
respective components A", a coefficient decoding means 
for decoding coded-coefficient data separated and 
arranged for respective components A" by the component
separating means, and a subband synthesizing means for 
reproducing a decoded image by combining subbands of 
data decoded by the coefficient decoding means for
respective components A", wherein the component
separating means separates the integrated component
unit as combinations of m (m ≥ 1) pieces of frequency
coefficients having the same relative positions in
respective m (m ≥ 1) subbands of the respective
components A" when the components A" have the same size
and the same number of subbands. 

9. A video decoding device provided with a 
variable-length decoding means for decoding variable- 
length coded data, a decoded-data counting means for
counting bits of each integrated component unit decoded 
by the variable-length decoding means, a decoding 
truncating means for comparing the number of bits 
counted by the decoded-data counting means with an 
externally-given number of bits and giving a decoding- 
stop command when the number of decoded bits exceeds 
the given number of bits, a component separating means 
for separating the decoded integrated component unit 
into respective components A", a data completing means 
for compensating for lack of truncated data by adding a 
specified value to each of the components composing a 
screenful image, data arranging means for arranging 
coded coefficient data separated by the component 
separating means into specified positions for 
respective components A", a coefficient decoding means 
for decoding coded-coefficient data separated and
arranged for respective components A" by the component 
separating means, and a subband synthesizing means for 
reproducing a decoded image by combining subbands of 
data decoded by the coefficient decoding means for 
respective components A", wherein the component
separating means separates the integrated component 
unit as combinations of the different number of 
frequency coefficients for respective components A" 
according to respective component sizes when the 
components A" are different in size and have the same 
number of subbands. 

10. A video decoding device provided with a 
variable-length decoding means for decoding variable- 
length coded data, a decoded-data counting means for
counting bits of each integrated component unit decoded
by the variable-length decoding means, a decoding
truncating means for comparing the number of bits
counted by the decoded-data counting means with an
externally-given number of bits and giving a decoding-
stop command when the number of decoded bits exceeds 
the given number of bits, a component separating means 
for separating the decoded integrated component unit 
into respective components A", a data completing means 
for compensating for lack of truncated data by adding a 
specified value to each of the components composing a 
screenful image, data arranging means for arranging
coded coefficient data separated by the component 
separating means into specified positions for 
respective components A", a coefficient decoding means 
for decoding coded-coefficient data separated and 
arranged for respective components A" by the component 
separating means, and a subband synthesizing means for 
reproducing a decoded image by combining subbands of 
data decoded by the coefficient decoding means for 
respective components A", wherein the component 
separating means separates the integrated component 
unit as combinations of the same number of low- 
resolution subbands and the different number of high- 
resolution subbands for respective components A" when 
the components A" are different in size and different 
in the number of subbands. 

11. A video decoding device provided with a 
variable-length decoding means for decoding variable- 
length coded data, a decoded-data counting means for 
counting bits of each integrated component unit decoded 
by the variable-length decoding means, a decoding 
truncating means for comparing the number of bits 
counted by the decoded-data counting means with an 
externally-given number of bits and giving a decoding- 
stop command when the number of decoded bits exceeds 
the given number of bits, a component separating means 
for separating the decoded integrated component unit
into respective components A", a data completing means 
for compensating for lack of truncated data by adding a 
specified value to each of the components composing a 
screenful image, data arranging means for arranging 
coded coefficient data separated by the component 
separating means into specified positions for 
respective components A", a coefficient decoding means 
for decoding coded-coefficient data separated and 
arranged for respective components A" by the component 
separating means, and a subband synthesizing means for 
reproducing a decoded image by combining subbands of 
data decoded by the coefficient decoding means for 
respective components A", wherein the component 
separating means separates the integrated component 
unit as combinations of the same number of high- 
resolution subbands and different pieces of low- 
resolution subbands for respective components A" when 
the components A" are different in size and different 
in the number of subbands. 

12. A video-decoding device as defined in claim 11, 
wherein the component separating means separates the 
integrated component unit as combinations of subbands 
for respective components, each combination composed of 
one lowest resolution subband and different numbers of 
all other low-resolution subbands. 

Abstract. The wavelet transform is a valuable tool in video processing
because of its flexibility in representing nonstationary signals.
Wavelet-based compression has the advantages of efficient decorrelation
of image frames and reduced-complexity multiresolution motion estimation
(MRME). We propose three techniques to improve motion estimation in
a wavelet-based coder. First, we propose to use an adaptive threshold
for coding the motion vectors of the high-pass subimages. Secondly, we
propose a bidirectional motion estimation (BMRME) technique in the
wavelet transform domain. In BMRME, we estimate the temporal (i.e.,
direction information) flags only for the blocks in the lowest-resolution
subimages and use the same information for the corresponding blocks
in the higher-resolution subimages. Finally, we propose a fast
multiresolution motion estimation technique where the directional
subimages at each level of the wavelet pyramid are combined into a
single subimage. Multiresolution motion estimation is then performed on
the newly formed subimages. The proposed techniques improve the coding
performance significantly over the baseline MRME technique. In addition,
they further reduce the computational complexity of the MRME technique.

Subject terms: visual communications and image processing; video compression; 
motion estimation; wavelets; multiresolution motion estimation. 

Optical Engineering 35(1), 126-136 (January 1996).



1 Introduction 

Digital video transmission is becoming increasingly important with
the advent of broadband networks such as the Integrated Services
Digital Network (ISDN), asynchronous transfer mode (ATM), etc. Digital
video data are voluminous, and hence efficient video compression
techniques are essential for video archival and transmission. The
International Organization for Standardization (ISO) has recently
proposed the MPEG standards for video compression (Ref. 1). The MPEG-1
standard (Ref. 2) has been developed for a targeted data rate of
1.5 Mbits/s. The MPEG-2 standard (Ref. 3) is an improved version of
MPEG-1 and is expected to be used in a variety of applications. These
standards employ a block-based motion estimation technique to reduce
the temporal redundancy present in a video sequence. To further reduce
the spatial redundancy present in the motion-compensated frames, the
discrete cosine transform (DCT) is used. We note that motion estimation
in MPEG is a computation-intensive task. In addition, the DCT has the
drawbacks of blocking artifacts, mosquito noise, and aliasing
distortions at high compression ratios.

Recently, the discrete wavelet transform (DWT) has become popular in
video coding applications for several reasons (Refs. 4, 5). First, it
is efficient in representing videos, which are in general nonstationary
in nature. Secondly, wavelets have high decorrelation and energy
compaction efficiency. Thirdly, blocking artifacts and mosquito noise
are absent in a wavelet-based coder, resulting in subjectively pleasing
reconstructed images. Fourthly, aliasing distortion can be reduced
significantly with a proper choice of wavelet filters. Finally, the
basis functions match the human visual system (HVS) characteristics.

In DWT coding, the choice of the motion estimation algorithm is
crucial and determines the coding performance and complexity of the
coder (Ref. 6). Zhang and Zafar (Ref. 7) have proposed a
multiresolution motion estimation (MRME) technique for estimating the
motion vectors in the wavelet domain. This technique estimates the
motion vectors hierarchically from lower-resolution to
higher-resolution subimages and thus reduces the complexity
significantly. However, there is potential to further improve both the
coding performance and the complexity of the MRME technique.

In this paper, we propose several techniques to improve 
the coding performance of the baseline MRME technique. 
First, we propose to employ an adaptive threshold for coding 
the motion vectors in the MRME framework (AMRME). 
Since the correlation between the corresponding high-pass 
subimages of the neighboring frames is not significant, 
AMRME improves the coding performance of the MRME 
technique. We then propose a bidirectional motion estimation 



MULTI RESOLUTION MOTION ESTIMATION TECHNIQUES FOR VIDEO COMPRESSION 



(BMRME) technique in the wavelet domain. The proposed BMRME technique
can be employed in the design of an MPEG-like wavelet coder. Finally,
we propose a fast multiresolution motion estimation (FMRME) technique,
which has a superior coding performance at a reduced complexity.

This paper is organized as follows. Section 2 provides a brief
description of wavelets and the MRME technique. In Sec. 3, the proposed
algorithms are detailed. The simulation results are provided in Sec. 4,
which is followed by conclusions in Sec. 5.

2 Wavelets and MRME Technique 

A wavelet transform represents any arbitrary function as a
superposition of a family of basis functions called wavelets. A family
of basis functions can be generated by translating and dilating the
mother wavelet corresponding to that family. The main advantages of
wavelets are: (1) compactly supported basis functions, (2) an adaptive
time-frequency window, and (3) multiresolution capability. A forward
DWT can be realized by using a two-channel filter bank as shown in
Fig. 1. The signal is passed through a low-pass filter (LPF) and a
high-pass filter (HPF), and the outputs of the filters are decimated by
two. For reconstruction, the coefficients are upsampled and passed
through another set of low-pass and high-pass filters. We note that for
orthonormal wavelets, the LPF and HPF are quadrature mirror filters
(QMFs).
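
As an illustration of the two-channel filter bank described above, the following is a minimal sketch that assumes the Haar wavelet (the filters used in the paper are not stated in this section) and an even-length input. The function names `haar_analysis` and `haar_synthesis` are hypothetical.

```python
import math

def haar_analysis(signal):
    """One analysis stage: low-pass and high-pass filtering, then decimation by two.

    Assumption: Haar filters and an even-length input signal.
    """
    s = 1.0 / math.sqrt(2.0)
    half = len(signal) // 2
    low = [s * (signal[2 * k] + signal[2 * k + 1]) for k in range(half)]
    high = [s * (signal[2 * k] - signal[2 * k + 1]) for k in range(half)]
    return low, high

def haar_synthesis(low, high):
    """One synthesis stage: upsample by two and apply the synthesis filter pair."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append(s * (a + d))  # even-indexed sample
        out.append(s * (a - d))  # odd-indexed sample
    return out

x = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 2.0, 0.0]
low, high = haar_analysis(x)
x_rec = haar_synthesis(low, high)  # equals x up to rounding error
```

Because the Haar pair is orthonormal, the synthesis stage recovers the input up to floating-point rounding.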

The 2-D DWT is usually calculated using a separable approach (Ref. 8).
Figure 2 shows a three-level wavelet decomposition of an image $S_1$
of size a × b pixels. In the first level of decomposition, one low-pass
subimage ($S_2$) and three orientation-selective high-pass subimages
($W_2^H$, $W_2^V$, $W_2^D$) are created. In the second level of
decomposition, the low-pass subimage is further decomposed into one
low-pass and three high-pass subimages ($W_4^H$, $W_4^V$, $W_4^D$).
This process is repeated on the low-pass subimage to form a
higher-level wavelet decomposition. In other words, the DWT decomposes
an image into a pyramid structure of subimages with various resolutions
corresponding to the different scales (see Fig. 3). The inverse wavelet
transform is calculated in the reverse manner; i.e., starting from the
lowest-resolution subimages, the higher-resolution images are
calculated recursively.
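
The separable pyramid decomposition can be sketched as follows. This is an illustration only: it assumes Haar filters and image dimensions divisible by 2 at every level, and the helper names (`haar_pairs`, `dwt2_level`, `pyramid`) are hypothetical. Which of the two mixed subimages is labelled H and which V is a naming convention, so the code leaves them as `w_a` and `w_b`.

```python
def haar_pairs(seq):
    """Haar low-pass/high-pass filtering of a 1-D sequence, decimated by two."""
    s = 0.5 ** 0.5
    half = len(seq) // 2
    low = [s * (seq[2 * k] + seq[2 * k + 1]) for k in range(half)]
    high = [s * (seq[2 * k] - seq[2 * k + 1]) for k in range(half)]
    return low, high

def filter_rows(mat):
    """Apply the 1-D filter pair along every row."""
    lows, highs = zip(*(haar_pairs(row) for row in mat))
    return list(lows), list(highs)

def filter_columns(mat):
    """Apply the 1-D filter pair along every column, returning row-major results."""
    lows, highs = zip(*(haar_pairs(list(col)) for col in zip(*mat)))
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]

def dwt2_level(img):
    """One 2-D level: a low-pass subimage and three high-pass subimages."""
    row_low, row_high = filter_rows(img)
    s, w_a = filter_columns(row_low)     # low-low and low-high
    w_b, w_d = filter_columns(row_high)  # high-low and high-high (diagonal)
    return s, w_a, w_b, w_d

def pyramid(img, levels):
    """Repeat the decomposition on the low-pass subimage to build the pyramid."""
    details = []
    s = img
    for _ in range(levels):
        s, w_a, w_b, w_d = dwt2_level(s)
        details.append((w_a, w_b, w_d))
    return s, details

image = [[1.0] * 8 for _ in range(8)]
lowpass, details = pyramid(image, 3)  # 8 x 8 input, three levels
```

For the constant test image all high-pass subimages are zero, and each level halves both dimensions of the low-pass subimage.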

A typical wavelet-based video coding scheme is shown in Fig. 4. The
coder consists of three major modules: wavelet transformation, motion
compensation, and quantization. Each frame is wavelet-decomposed into
three to five stages. The temporal redundancy that exists in a video
sequence is removed by motion compensation. The error frames are then
quantized for encoding. It should be noted that other wavelet-based
coding schemes do exist. For example, motion compensation can be done
before wavelet decomposition, or the residual error frames can be
further decorrelated by applying the DCT. However, it has been
concluded in Ref. 7 that the coding scheme shown in Fig. 4 provides
superior coding performance to the techniques just mentioned.

Several motion estimation techniques have been reported in the
literature. However, block matching is widely used because of its
simplicity. In the block matching process, the current frame (t) of a
video sequence is divided into blocks of size n × n pixels as shown in
Fig. 5. For each block (reference block) in the current frame (t), the
previous frame (t - 1) is searched within a neighborhood (search area)
in order to obtain the best match block with respect to a prespecified
error criterion such as the mean squared error (MSE), defined as
follows:

$$ \mathrm{MSE}(u, v) = \sum_{i=1}^{n} \sum_{j=1}^{n} \left| S(i+u, j+v) - R(i, j) \right|^2, \qquad -p \le u, v \le p, \qquad (1) $$

where R(i, j) is a reference block's pixel in the current frame (t) and
S(i+u, j+v) is a candidate block's pixel within a search area in the
previous frame (t - 1). We note that the search area consists of
(n + 2p) × (n + 2p) pixels, where p is the maximum allowed
displacement, and the total number of possible candidate blocks is
$(2p+1)^2$. The most intuitive approach for block matching is to use
the full search algorithm (FSA). For each reference block, all possible
$(2p+1)^2$ candidate blocks are searched to obtain the best match,
which is used as a prediction estimate for the reference block. The
relative displacement between the reference block and the best match
block constitutes the motion vector, which is transmitted to the
receiver. We note, however, that the execution of the FSA is a
computationally expensive procedure. Recently, several fast methods
have been proposed for block-based motion estimation using logarithmic
or hierarchical search (Ref. 9). However, these algorithms may converge
to a local optimum, which corresponds to an inaccurate prediction of
the motion vectors and results in poor performance.
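
The full search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: frames are plain lists of rows, u is taken as the row displacement and v as the column displacement, the error is left unnormalized (which does not change the best match), and candidates falling outside the frame are simply skipped. The names `block_error` and `full_search` are hypothetical.

```python
import random

def block_error(cur, prev, top, left, u, v, n):
    """Squared-error criterion between the n x n reference block at (top, left)
    in the current frame and the candidate displaced by (u, v) in the previous frame."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            diff = prev[top + i + u][left + j + v] - cur[top + i][left + j]
            total += diff * diff
    return total

def full_search(cur, prev, top, left, n, p):
    """Evaluate every displacement in [-p, p] x [-p, p]; return (u, v, error)."""
    rows, cols = len(prev), len(prev[0])
    best = None
    for u in range(-p, p + 1):
        for v in range(-p, p + 1):
            inside = (0 <= top + u and top + u + n <= rows and
                      0 <= left + v and left + v + n <= cols)
            if not inside:
                continue  # candidate block would leave the previous frame
            err = block_error(cur, prev, top, left, u, v, n)
            if best is None or err < best[2]:
                best = (u, v, err)
    return best

# Synthetic test data: the current frame is the previous frame displaced by (1, -2).
rng = random.Random(0)
prev = [[rng.randint(0, 255) for _ in range(16)] for _ in range(16)]
cur = [[prev[(r + 1) % 16][(c - 2) % 16] for c in range(16)] for r in range(16)]
best = full_search(cur, prev, top=6, left=6, n=4, p=3)
```

With p = 3 the search evaluates up to $(2p+1)^2 = 49$ candidates per block, which is the cost the multiresolution scheme aims to reduce.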

Recently, a multiresolution motion estimation (MRME) scheme has been
reported for wavelet-based video compression (Ref. 7). This approach
exploits the multiresolution property of the wavelet pyramid in order
to reduce the computational complexity of the motion estimation
process. In the MRME scheme, the motion vectors at the highest level of
the wavelet pyramid are first estimated using the conventional
block-matching-based motion estimation algorithm. Then the motion
vectors at the next level of the wavelet pyramid are predicted from the
motion vectors of the preceding level, which are refined at each step.
For example, the motion vectors in $W_4^H$, $W_4^V$, and $W_4^D$ are
predicted from the motion vectors in $W_8^H$, $W_8^V$, and $W_8^D$,
respectively. The motion vectors of the lower-level pyramid can be
estimated as follows:

$$ V_4^o(x, y) = 2 V_8^o(x, y) + \Delta_4^o(x, y), \qquad (2) $$

$$ V_2^o(x, y) = V_4^o(x, y) + 2 V_8^o(x, y) + \Delta_2^o(x, y), \qquad (3) $$


Fig. 1 1-D wavelet decomposition and reconstruction. 



Fig. 2 Wavelet-transformed image.

Fig. 3 Wavelet pyramid.

where $V_i^o(x, y)$ represents the motion vector of the reference block
centered at (x, y) for the o-orientation subimage,
$o \in \{H, V, D\}$, at the various levels of the pyramid. The
incremental motion vector $\Delta_i^o(x, y)$ is calculated within a
reduced search area centered at $2 V_8^o(x, y)$ and
$V_4^o(x, y) + 2 V_8^o(x, y)$ for the level-2 and level-1 pyramids,
respectively. The subimages of the level-3, level-2, and level-1
pyramids are divided into small blocks of size n × n, 2n × 2n, and
4n × 4n, respectively. With this structure, the numbers of blocks in
all the subimages are identical. As a result, there is a one-to-one
correspondence between the blocks at the various levels of the wavelet
pyramid. The search windows for the level-3, level-2, and level-1
subimages are p, p/2, and p/4, respectively. The refinement of the
motion estimation process is shown in Fig. 6. Table 1 compares the
complexity of the FSA with that of the MRME technique. It is observed
that the hierarchical prediction and refinement provides a superior
motion estimation at a significantly reduced complexity. The complexity
of the MRME technique has been calculated by assuming that the local
motion vectors have been searched using the FSA. The complexity of the
MRME can be reduced further by using a logarithmic or hierarchical
search.
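
The prediction-and-refinement step from level 3 to level 2 can be sketched as follows: the coarse vector is doubled and only a small window around that prediction is searched, so the refined vector is the doubled coarse vector plus an incremental vector. This is a minimal sketch on synthetic data; the helper names `sse` and `refine`, the frame sizes, and the refinement radius of 1 are assumptions made for illustration.

```python
import random

def sse(cur, prev, top, left, u, v, n):
    """Sum of squared differences for displacement (u, v)."""
    return sum((prev[top + i + u][left + j + v] - cur[top + i][left + j]) ** 2
               for i in range(n) for j in range(n))

def refine(cur, prev, top, left, n, center, radius):
    """Search only +/- radius around the predicted vector `center`."""
    cu, cv = center
    candidates = [(cu + du, cv + dv)
                  for du in range(-radius, radius + 1)
                  for dv in range(-radius, radius + 1)]
    return min(candidates, key=lambda c: sse(cur, prev, top, left, c[0], c[1], n))

# Synthetic level-2 subimages: the true displacement is (3, -2).
rng = random.Random(0)
size = 24
prev4 = [[rng.randint(0, 255) for _ in range(size)] for _ in range(size)]
cur4 = [[prev4[(r + 3) % size][(c - 2) % size] for c in range(size)]
        for r in range(size)]

coarse = (1, -1)                                   # vector assumed found at level 3
prediction = (2 * coarse[0], 2 * coarse[1])        # doubled for the next level
refined = refine(cur4, prev4, top=8, left=8, n=8, center=prediction, radius=1)
delta = (refined[0] - prediction[0], refined[1] - prediction[1])
```

Only 9 candidates are evaluated here, compared with the 49 a full search with p = 3 would need for the same block.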

3 Proposed Techniques 

The MRME technique described in the previous section reduces the
complexity compared to the FSA. In this section, we propose several
techniques to further improve the coding performance of the MRME
technique. In Sec. 3.1, we propose to use an adaptive threshold while
coding the motion vectors. A bidirectional MRME scheme is then proposed
in Sec. 3.2. Finally, a fast MRME technique is proposed in Sec. 3.3.
The performance evaluation of the proposed techniques is presented in
the next section.

3.1 Adaptive Thresholding Technique 

Fig. 4 A simple wavelet-based video coder.

Fig. 5 Block-matching motion estimation process.

In the MRME technique, the motion vectors corresponding to all the
subimages are calculated. However, we note that the high-frequency
subimages of consecutive frames are not correlated even though the
consecutive frames are very similar. This is due to the following
reasons. Firstly, the DWT is not translation-invariant, i.e., if an
image is shifted by a pixel, the transform coefficients will not be
shifted by 1 pixel (in the wavelet domain). As a result, the object
motion in the spatial domain does not correspond to the translation of
coefficients in the wavelet domain. This is especially true for
high-pass subimages. For low-pass subimages, great similarity among
neighboring images has been observed, mainly because the basis
functions are much smoother and thus the effects of translations are
averaged. Secondly, the high-pass subimages represent only the edge
information and hence they are expected to change rapidly from image to
image even with a small change in scene. Although the high-pass
subimages cannot be predicted well, the MRME approach performs well in
practice because most of the information (or energy) is contained in
the low-pass subimages.

We propose to use an adaptive thresholding (AMRME) approach for
estimating motion vectors in the high-frequency subimages. If the
dissimilarity between the reference block and the best match is greater
than a threshold, the block is discarded and the motion vectors
corresponding to that block are not coded. A zero block is initialized
at the corresponding place in the decoder. This thresholding improves
the coding performance in two ways. First, the number of motion vectors
to be coded will be smaller, resulting in a reduced bit rate. Secondly,
the objective quality of the reconstruction will improve, since we are
discarding the mismatched block.

The choice of threshold is crucial in achieving good performance. Since
we have assumed PSNR as the measure of quality, the adaptive threshold
can be related to the energy of the reference block. If the threshold
is made equal to the energy of the reference block, the image frames
will be reconstructed with least distortion (in the L2 metric).
However, we note that the rate-distortion function may not be optimal,
since the entropy of the motion vectors may be high. Usually, a
threshold less than the energy of the block provides better results. We
define thresh_factor as the ratio of the threshold to the energy of the
block. Depending on the video sequence, a thresh_factor between 0.6 and
0.9 provides good coding performance. It should be noted that the
complexity of the AMRME is marginally higher than that of the MRME,
since in the former one has to calculate the energy of the blocks.

Fig. 6 Multiresolution motion estimation.

Table 1 Computational complexity of various motion estimation
algorithms (in operations per pixel).
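
The thresholding rule can be sketched as follows. This is a minimal sketch rather than the authors' code: the dissimilarity is taken to be the sum of squared differences, the function names are hypothetical, and the default thresh_factor of 0.75 is an arbitrary choice inside the 0.6 to 0.9 range quoted above.

```python
def block_energy(block):
    """Energy of a block: sum of squared coefficients."""
    return sum(x * x for row in block for x in row)

def amrme_decide(ref_block, best_match, thresh_factor=0.75):
    """Return (code_vector, prediction) for one high-pass block.

    If the matching error exceeds thresh_factor times the block energy, the
    motion vector is not coded and a zero block is used as the prediction.
    """
    rows, cols = len(ref_block), len(ref_block[0])
    err = sum((ref_block[i][j] - best_match[i][j]) ** 2
              for i in range(rows) for j in range(cols))
    threshold = thresh_factor * block_energy(ref_block)
    if err > threshold:
        zero_block = [[0] * cols for _ in range(rows)]
        return False, zero_block
    return True, best_match

ref = [[10, -10], [5, -5]]        # energy 250
good = [[9, -9], [5, -4]]         # error 3: vector is coded
bad = [[-10, 10], [-5, 5]]        # error 1000: block is discarded

keep_good, pred_good = amrme_decide(ref, good)
keep_bad, pred_bad = amrme_decide(ref, bad)
```

Predicting a discarded block by zeros leaves a residual equal to the block energy, which is why a match whose error exceeds that energy cannot improve the L2 distortion.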

3.2 Bidirectional Motion Estimation 

To achieve good coding performance, the MPEG standard (Ref. 1) suggests
that the frames be divided into three categories: I, P, and B frames.
The I frames are coded independently of the others. For P frames, the
motion vectors are estimated by comparing the P frame with a previous I
frame. The motion vectors and the prediction error frame are then sent
to the decoder for reconstruction. The B frames are compared with the
previous and the next reference frame (I or P frame), and usually three
types of predictions are used: forward prediction, backward prediction,
and the average of two macroblocks (Ref. 1). After the motion
estimation is performed, the motion vectors are sent to the decoder. We
note that for a fast-moving sequence, the motion vectors alone may not
provide good subjective quality of the reconstructed video frames.
Reducing the block size, increasing the search window, and sending the
error frames help in improving the subjective quality. However, they
also increase the bit rate. Our simulation results indicate that it is
more beneficial to send the quantized error frames. Hence, we have
implemented an adaptive wavelet-based video coder that sends the error
frames corresponding to B frames only when it is necessary, i.e., when
the difference between the reconstructed I and B frames exceeds a
certain threshold.

Although bidirectional motion estimation improves the 
coding performance, it also increases the complexity of the 
coder. More memory is needed in the decoder to store the 
prediction reference frame (backward frame). An extra pic- 
ture delay is introduced. Most important, the computational 
complexity is twice that of unidirectional estimation. In the 
multiresolution framework, the motion vectors are estimated 
hierarchically from lower-resolution to higher-resolution 
subimages. Hence, there is a potential to reduce the com- 
putational complexity of bidirectional motion estimation with 
minimal degradation in performance. We assume a simple 
version of MPEG bidirectional motion estimation. Only the 
forward and backward motion estimation are considered in 
this paper. Although this may degrade coding performance 
marginally, there are several advantages. First, the encoder 
sends only one motion vector (instead of two in case of 
averaging). Secondly, the decoder will have reduced com- 
plexity, since it does not have to average the two macro- 
blocks. 

In the proposed bidirectional motion estimation, a temporal flag
(TFLAG) will have one of two states: 0 or 1, depending on whether the
best match is from the previous or from the next reference frame. Since
a small region of an image is represented by blocks from various
directional subimages, it is highly probable that the temporal flags
will be identical for all the corresponding blocks from the subimages
of the particular orientation. Hence, in the BMRME technique, we
propose to calculate the temporal flags only in the four
lowest-resolution subimages (i.e., $S_8$, $W_8^H$, $W_8^V$, and
$W_8^D$). The knowledge of the temporal flags of the high-pass
subimages at the first level is then used as an estimate for the
higher-resolution (lower-level pyramid) subimages in the same
orientation. The procedure can be summarized as follows:

1. For each block position (x, y), the motion vectors are
calculated for all the subimages of the highest-level
pyramid with respect to both the previous and the next
frame. Initialize $V_8^o(x, y)$, $o \in \{H, V, D\}$, with the better
matching motion vectors (between the two frames).

2. Set $\mathrm{TFLAG}_8^o(x, y) = 0$, $o \in \{H, V, D\}$, if the
reference block matches better with the block from the previous
frame. Else, set $\mathrm{TFLAG}_8^o(x, y) = 1$.

3. For subimages corresponding to higher levels of the
pyramid, TFLAG is no longer calculated. The motion
vectors $V_4^o(x, y)$, $V_2^o(x, y)$ are estimated using Eqs. (2)
and (3) with respect to the previous frame if
$\mathrm{TFLAG}_8^o(x, y) = 0$; else the motion vectors are
estimated with respect to the next reference frame.
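
The three steps above can be sketched for a single block and a single orientation as follows. This is an illustrative sketch on synthetic data, not the authors' implementation: the helper names (`sse`, `search`, `coarse_step`, `fine_step`), the frame sizes, and the refinement radius are all assumptions, and only the doubling prediction of Eq. (2) is shown for the finer level.

```python
import random

def sse(cur, frame, top, left, u, v, n):
    """Sum of squared differences for displacement (u, v)."""
    return sum((frame[top + i + u][left + j + v] - cur[top + i][left + j]) ** 2
               for i in range(n) for j in range(n))

def search(cur, frame, top, left, n, center, radius):
    """Return (error, (u, v)) of the best candidate within +/- radius of center."""
    cu, cv = center
    return min((sse(cur, frame, top, left, cu + du, cv + dv, n), (cu + du, cv + dv))
               for du in range(-radius, radius + 1)
               for dv in range(-radius, radius + 1))

def coarse_step(cur, prev, nxt, top, left, n, p):
    """Steps 1 and 2: search both reference frames, keep the better vector, set TFLAG."""
    err_prev, vec_prev = search(cur, prev, top, left, n, (0, 0), p)
    err_next, vec_next = search(cur, nxt, top, left, n, (0, 0), p)
    if err_prev < err_next:
        return vec_prev, 0
    return vec_next, 1

def fine_step(cur, prev, nxt, top, left, n, coarse_vec, tflag, radius):
    """Step 3: reuse TFLAG so that only one reference frame is searched."""
    frame = prev if tflag == 0 else nxt
    center = (2 * coarse_vec[0], 2 * coarse_vec[1])
    return search(cur, frame, top, left, n, center, radius)[1]

def noise(rng, size):
    return [[rng.randint(0, 255) for _ in range(size)] for _ in range(size)]

rng = random.Random(1)
# Coarsest level: the current subimage matches the NEXT frame with vector (1, 1).
prev8, next8 = noise(rng, 12), noise(rng, 12)
cur8 = [[next8[(r + 1) % 12][(c + 1) % 12] for c in range(12)] for r in range(12)]
# Finer level: the true vector is (2, 3), i.e. the doubled coarse vector plus (0, 1).
prev4, next4 = noise(rng, 24), noise(rng, 24)
cur4 = [[next4[(r + 2) % 24][(c + 3) % 24] for c in range(24)] for r in range(24)]

coarse_vec, tflag = coarse_step(cur8, prev8, next8, top=4, left=4, n=4, p=2)
fine_vec = fine_step(cur4, prev4, next4, top=8, left=8, n=8,
                     coarse_vec=coarse_vec, tflag=tflag, radius=1)
```

The second reference frame is searched only at the coarsest level, which is where the doubling of complexity is confined.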

The complexity of the proposed algorithm is twice the complexity of the
MRME technique (which is unidirectional) for the first four subimages,
and identical to that of MRME for the other six subimages. The overall
complexities of MRME and BMRME are compared in Table 1. The complexity
is given in operations per pixel (for the MSE criterion, an operation
includes one subtraction, one multiplication, and one addition). It is
observed that the complexity of BMRME is marginally higher than that of
MRME.

To compare the bit rates of MRME and BMRME, we calculate the number of
blocks in each subimage as

$$ m = \frac{M}{8n} \times \frac{N}{8n}, \qquad (4) $$

where M and N are, respectively, the numbers of rows and columns of an
image frame, and n × n is the block size at the highest level of the
pyramid. The numbers of motion vectors (each vector has two components:
horizontal and vertical) and temporal flags required to be transmitted
are shown in Table 2. The dynamic ranges of the motion vectors (each
component) and temporal flags are shown in column 2. We observe that
the numbers of motion vectors in MRME and BMRME are identical. However,
BMRME has the small overhead of transmitting the temporal flags (for
the first four subimages).

3.3 Fast MRME 

We recall that the DWT decomposes an image into a pyramid structure of
subimages. The motion vectors for different orientation subimages at
each level of the wavelet pyramid actually describe the same part of an
object in a scene. In other words, the motion activities of the
different wavelet subimages at the same pyramid level are highly
correlated because they represent the motion on the same scale. In the
proposed scheme (FMRME), the wavelet components at each level of the
pyramid are combined into a single all-orientation subimage. In other
words, $W_8^H$, $W_8^V$, $W_8^D$ are combined into $W_8^A$, whereas
$W_4^H$, $W_4^V$, $W_4^D$ are combined into $W_4^A$, etc. This process
is illustrated in Fig. 7. We note that the motion estimation is
performed only on the all-orientation subimages ($W_8^A$, $W_4^A$,
etc.). This contrasts with the MRME scheme, where the motion estimation
is performed separately on all the individual wavelet subimages
($W_8^H$, $W_8^V$, $W_8^D$, etc.). Hence the FMRME scheme exploits the
correlation among the motion vectors of the subimages of a pyramid
level.

The motion activities at different levels of the pyramid 
are highly correlated, since they actually characterize the 
same motion structure at different scales. Hence, in FMRME, 
the motion vectors of the all-orientation subimage of a lower- 
level pyramid are predicted and refined from the motion vec- 
tors of the all-orientation subimage of a higher-level pyramid. 
For a three-stage decomposition, the motion vectors of dif- 
ferent levels of pyramids can be expressed as 



$$V_4^A(x, y) = 2\,V_8^A(x, y) + \Delta_4^A(x, y), \qquad (5)$$

$$V_2^A(x, y) = V_4^A(x, y) + 2\,V_8^A(x, y) + \Delta_2^A(x, y). \qquad (6)$$

At the receiver, the motion vectors of the all-orientation 
subimage $W_i^A$ are assigned as the motion vectors of $W_i^H$, 
$W_i^V$, and $W_i^D$ in order to reconstruct each wavelet 
subimage, i.e., 

$$V_i^H(x, y) = V_i^V(x, y) = V_i^D(x, y) = V_i^A(x, y) \quad \text{for } i = 2, 4, 8. \qquad (7)$$
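
The prediction and refinement steps can be sketched as follows. This is a minimal illustration under an assumed reading of Eqs. (5) to (7): the level-4 vector is twice the level-8 vector plus a refinement, the level-2 vector is the level-4 vector plus twice the level-8 vector plus a refinement, and the decoder copies each all-orientation vector to the three orientation subimages. The function names and the orientation labels H, V, and D are hypothetical.

```python
def predict_vectors(v8, d4, d2):
    """Return (v4, v2) for one block from the coarsest vector and refinements.

    v8 is the (vertical, horizontal) vector found at the coarsest level;
    d4 and d2 are the refinements found at the two finer levels.
    Assumed reading of the text: v4 = 2*v8 + d4 and v2 = v4 + 2*v8 + d2.
    """
    v4 = (2 * v8[0] + d4[0], 2 * v8[1] + d4[1])
    v2 = (v4[0] + 2 * v8[0] + d2[0], v4[1] + 2 * v8[1] + d2[1])
    return v4, v2


def assign_to_orientations(v_all):
    """Copy an all-orientation vector to the three orientation subimages."""
    return {"H": v_all, "V": v_all, "D": v_all}


v4, v2 = predict_vectors(v8=(3, -1), d4=(1, 0), d2=(-1, 1))
print(v4, v2)  # (7, -2) (12, -3)
print(assign_to_orientations(v4))
```

Only the coarsest vector and the small refinements need to be transmitted, which is where the saving in motion-vector bits comes from.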

Table 2 compares the numbers of motion vectors (each vector 
has a horizontal and a vertical component) required to be 
sent to the decoder. We observe that the number of motion 
vectors in FMRME is about 40% of that required for MRME 
(4*m versus 10*m in total). 

Fig. 7 Wavelet all-orientation subimages. Panel labels: all-orientation 
subimages $W^A$ with n × n, 2n × 2n, and 4n × 4n blocks. 

Table 2 Number of motion vectors in MRME and BMRME algorithms 
(m = number of blocks in a subimage). 

Level   Dynamic range of vectors   MRME   FMRME   BMRME   BFMRME
1       -4, 4                      4*m    2*m     4*m     2*m
2       -2, 2                      3*m    1*m     3*m     1*m
3       -1, 1                      3*m    1*m     3*m     1*m
*       0, 1                       Nil    Nil     2*m

*Overhead due to time flags. 
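
The 40% figure follows directly from the per-level counts in Table 2. The sketch below sums the MRME and FMRME columns; the dictionary layout, function name, and example value of m are assumptions made for illustration.

```python
# Motion vectors per level, in units of m, as read from the MRME and FMRME
# columns of Table 2 (three pyramid levels each).
VECTORS_PER_LEVEL = {
    "MRME": (4, 3, 3),
    "FMRME": (2, 1, 1),
}


def total_vectors(scheme, m):
    """Total number of motion vectors per frame for m blocks per subimage."""
    return sum(VECTORS_PER_LEVEL[scheme]) * m


m = 180  # hypothetical block count, used only as an example
mrme = total_vectors("MRME", m)
fmrme = total_vectors("FMRME", m)
print(mrme, fmrme, fmrme / mrme)  # 1800 720 0.4
```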

4 Performance of the Proposed Techniques 

The performance of the proposed techniques has been evaluated 
with three test sequences: (1) "Miss America" (CIF 
format, 360 × 288), (2) "Salesman" (360 × 288), and 
(3) "Ping-Pong" (360 × 240). "Miss America" and "Salesman" 
are typical video conferencing sequences with slow 
motion and low spatial detail. On the other hand, "Ping-Pong" 
has high spatial detail and fast motion. The basic video 
coder shown in Fig. 4 was employed in the simulations. 
The full search algorithm with mean squared error (MSE) as 
the matching criterion has been used to obtain the motion 
vectors. The block sizes of the level-3, level-2, and level-1 
subimages have been chosen as 3 × 3, 6 × 6, and 12 × 12, 
respectively. The maximum allowed displacement for level-3 
($S_8$ and $W_8$) subimages is 4 pixels, and the maximum allowed 
refinements in level-2 ($W_4$) and level-1 ($W_2$) subimages are 
2 pixels and 1 pixel, respectively. The peak signal-to-noise ratio 
(PSNR), which is defined as 

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right)\ \mathrm{dB},$$

has been employed as a measure of the quality of the reconstructed 
images. The coding performance of the various techniques 
has been compared with respect to bit rate versus 
PSNR. The bit rate and PSNR were calculated by averaging 
the bit rate and MSE over all the frames (I, P, and B). In order 
to calculate the bit rate, a frame rate of 24 frames/s was 
assumed. The GOP was taken as IBBBBPBBBBPBBBBI 
(i.e., four B frames between consecutive reference frames). 
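
A minimal sketch of the full search with the MSE criterion, together with a PSNR helper, is given below. It uses plain Python lists for clarity. The function names, the (vertical, horizontal) displacement convention, the rule of skipping candidates that fall outside the reference subimage, and the 8-bit peak value of 255 are assumptions for illustration, not details confirmed by the text.

```python
import math


def block_mse(cur, ref, top, left, dy, dx, size):
    """MSE between a block of `cur` and the block of `ref` displaced by (dy, dx).

    Each pixel costs one subtraction, one multiplication, and one addition.
    """
    total = 0.0
    for r in range(size):
        for c in range(size):
            diff = cur[top + r][left + c] - ref[top + dy + r][left + dx + c]
            total += diff * diff
    return total / (size * size)


def full_search(cur, ref, top, left, size, max_disp):
    """Try every displacement in [-max_disp, max_disp]^2 and keep the best."""
    rows, cols = len(ref), len(ref[0])
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            r0, c0 = top + dy, left + dx
            # Assumption: candidates partly outside the reference are skipped.
            if r0 < 0 or c0 < 0 or r0 + size > rows or c0 + size > cols:
                continue
            err = block_mse(cur, ref, top, left, dy, dx, size)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best, best_err


def psnr(mse, peak=255.0):
    """PSNR in dB for a given MSE, assuming 8-bit samples by default."""
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)


# Toy example: one bright pixel that moves between the two subimages.
ref = [[0] * 12 for _ in range(12)]
cur = [[0] * 12 for _ in range(12)]
ref[5][4] = 100
cur[4][6] = 100
vector, err = full_search(cur, ref, top=3, left=5, size=3, max_disp=4)
print(vector, err)  # (1, -2) 0.0
```

With a 3 × 3 block and a maximum displacement of 4 pixels, the search evaluates up to 81 candidate positions per block, which is the cost the multiresolution refinement is meant to limit at the finer levels.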

The choice of wavelets and the quantization scheme are 
crucial in achieving good performance. In all our simulations, 
we have used the Daubechies 8-tap wavelet, which provides 
good coding performance [10]. The wavelet coefficients (or the 
motion-compensated error coefficients) were quantized with 
a fairly simple uniform quantizer [11]. For each level of the 
pyramid (four subimages for the lower-resolution level, and 
three for the higher-resolution levels) one quantizer was used. 
The ratio of the quantization step sizes of the successive levels 
was chosen as 2.0 (increasing from lower-resolution to 
higher-resolution subimages). The quantized coefficients are 
encoded with an arithmetic coder [12] to achieve superior coding 
performance. 
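
The per-level quantization can be sketched as below. The text specifies only a uniform quantizer with a step-size ratio of 2.0 between successive levels; the round-to-nearest rule, the base step value, the level numbering (level 3 = lowest resolution), and the function names are assumptions made for illustration.

```python
import math


def level_step_sizes(base_step, levels=3, ratio=2.0):
    """Step size for each pyramid level, smallest at the lowest resolution.

    Level `levels` (lowest resolution) gets `base_step`; each higher-resolution
    level multiplies the step by `ratio`.
    """
    return {level: base_step * ratio ** (levels - level)
            for level in range(levels, 0, -1)}


def quantize(coeff, step):
    """Uniform quantizer index, rounding to the nearest step (assumed rule)."""
    return int(math.floor(coeff / step + 0.5))


def dequantize(index, step):
    """Reconstruct the coefficient value from its quantizer index."""
    return index * step


steps = level_step_sizes(4.0)
print(steps)  # {3: 4.0, 2: 8.0, 1: 16.0}
index = quantize(37.0, steps[2])
print(index, dequantize(index, steps[2]))  # 5 40.0
```

Coarser steps at the higher-resolution levels spend fewer bits on the detail coefficients, which typically matter less for perceived quality than the low-resolution ones.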

The coding performance of AMRME depends on the 
choice of the threshold. Figure 8 compares the coding 
performance for various values of thresh_factor. It is observed 
that the best thresh_factor is in the range 0.6 to 0.8 for most 
sequences. For "Miss America" (and also for "Salesman," 
which is not shown here), the performance is sensitive 
to the choice of threshold. "Miss America" (like 
"Salesman"), being a low-entropy sequence, requires fewer 
bits for encoding. Hence the bit saving due to thresholding 
is able to improve the overall coding performance. However, 
"Ping-Pong," being a high-entropy sequence, requires a 
large number of bits for encoding. Hence, the bit saving due 
to thresholding is not very significant compared to the overall 
bit rate. In all our subsequent simulations, we have used a 
threshold with thresh_factor equal to 0.7 for all three 
sequences. 

The coding performance of AMRME (with thresh_factor = 0.7) 
is shown in Fig. 9. It is observed that AMRME 
provides an improvement of more than 1 dB compared to 
MRME. The improvement is more significant for the 
"Ping-Pong" sequence, since the high-pass subimages of the 
consecutive frames are less correlated due to its fast motion. 

Fig. 8 Performance comparison of various thresholds: 
(a) "Ping-Pong," (b) "Miss America." Horizontal axis: bit rate (in Mbits/s). 

Fig. 9 Comparison of MRME and AMRME techniques (with 
thresh_factor = 0.7). Horizontal axis: bit rate (in Mbits/s). 

Figure 10 compares the coding performance of B frames 
for the MRME and BMRME techniques (using an adaptive 
threshold in both cases). We observe that BMRME provides 
superior coding performance to MRME. In addition, 
BMRME addresses the uncovered areas better, since an un- 
covered area can be predicted only from the future reference 
frame. Hence, the BMRME technique has the potential to provide 
superior performance when there is a scene change in 
a video sequence. 

The relative performance of the MRME and FMRME techniques 
is shown in Fig. 11. In general, FMRME is expected 
to provide poorer performance than MRME because 
of its coarse motion estimation. However, we note that 
FMRME requires fewer motion vectors to be encoded and thus 
saves some bits. In Fig. 11, it is observed that for the 
"Ping-Pong" sequence, MRME provides better coding 
performance. Since "Ping-Pong" is a sequence with fast motion, 
the corresponding motion vectors from different subimages 
are less correlated, and hence superior performance is 
achieved by estimating motion individually for all the 
subimages. However, it is observed that FMRME provides a 
coding performance comparable to MRME for the "Miss 
America" and "Salesman" sequences. For such slow 
sequences, there is correlation among the motion vectors of the 
three subimages at any level of the wavelet pyramid. Hence 
the bit-rate saving due to the smaller number of motion 
vectors compensates for the degradation due to coarse motion 
estimation. It is also observed that FMRME works better for 
very low-bit-rate coding. At a lower bit rate, FMRME spends 
fewer bits in coding motion vectors and spends the remaining 
bits in encoding the error coefficients. In summary, for slow 
motion sequences, the overall coding performance of 
FMRME is comparable to that of MRME at a significantly 
reduced complexity. 

Fig. 10 Comparison of MRME and BMRME techniques. In all 
cases, a thresh_factor of 0.7 has been used. Horizontal axis: bit rate (in Mbits/s). 

Finally, the overall coding performance, combining all 
three techniques, is shown in Fig. 12. We observe that an 
improvement of 1 to 2 dB in coding performance is obtained 
with the proposed techniques, compared to MRME, for all 
the video sequences. We also note that the overall complexity 
of the proposed techniques is significantly less than for 
MRME. 




Fig. 11 Comparison of MRME and FMRME techniques. In all cases, 
a thresh_factor of 0.7 has been used. Panel (c): "Salesman." 
Horizontal axis: bit rate (in Mbits/s). 



Fig. 12 Performance of the combined three techniques. 

5 Conclusions 

In this paper, we have first presented the use of an adaptive 
threshold for coding the motion vectors in high-pass 
subimages. This threshold improves the coding performance of 
the MRME technique, as there is little correlation among the 
high-pass coefficients of two neighboring frames. The 
improvement is more appreciable for fast motion sequences. 
Secondly, we have proposed a bidirectional multiresolution 
motion estimation (BMRME) technique for a wavelet transform-based 
video coder. The BMRME technique provides superior coding 
performance (an improvement of more than 1 dB) to the 
MRME technique. The performance improvement over 
MRME due to BMRME is more pronounced for a fast motion 
sequence. The proposed technique can be employed in the 
design of a wavelet-based MPEG coder. Finally, we have 
proposed a fast multiresolution motion estimation (FMRME) 
technique. This technique exploits the correlation among the 
motion vectors of the subimages of a particular wavelet 
pyramid and significantly reduces the computational complexity 
(by about 60%). In addition, the number of motion vectors 
is reduced. As a result, FMRME provides superior coding 
performance to MRME, especially for a slow motion 
sequence. 

Further work can be carried out to extend the MRME and 
BMRME techniques to a generalized wave-packet 
decomposition scheme. In wave packets, the decomposition is image 
adaptive and irregular in nature, resulting in superior coding 
performance [10]. The extension of MRME techniques to wave 
packets has the potential to further improve the coding 
performance. 